Abstract



The structure and size of the interleaver used in a turbo code critically affect the distance spectrum and the covariance
property of a component decoder's information input and soft output. This paper introduces a new class of interleavers,
the inter-block permutation (IBP) interleavers, that can be built on any existing "good" block-wise interleaver by simply
adding an IBP stage. The IBP interleavers reduce the above-mentioned correlation and increase the effective interleaving
size. The increased effective interleaving size improves the distance spectrum while the reduced covariance enhances the
iterative decoder's performance. Moreover, the structure of the IBP(-interleaved) turbo codes (IBPTC) is a natural fit
for high rate applications that necessitate parallel decoding.

We present some useful bounds and constraints associated with the IBPTC that can be used as design guidelines. 
The corresponding codeword weight upper bounds for weight-2 and weight-4 input sequences are derived. Based on some 
of the design guidelines, we propose a simple IBP algorithm and show that the associated IBPTC yields 0.3 to 1.2 dB 
performance gain, or equivalently, an IBPTC renders the same performance with a much reduced interleaving delay. The 
EXIT and covariance behaviors provide another numerical proof of the superiority of the proposed IBPTC. 
I. Introduction

The turbo code's extraordinary performance is in part due to a class of suboptimal iterative decoding 
algorithms that generate soft outputs based on the maximum a posteriori (MAP) principle. At each 
decoding round, an a posteriori probability (APP) decoder provides extrinsic information for use in 
the ensuing round as the a priori information. The extrinsic information about an information bit 
is gathered through message-passing from the channel output samples corresponding to other related 
bits, and its derivation is based on the structures of the interleaver and the component codes as well 
as the statistical property of the channel. The interleaver is thus an integrated and critical component 
of a turbo code (TC) and its importance has been well documented. 

As the interleaver is of finite size, say, L bits, the input information sequence is segmented into 
blocks of L bits, leading to separate encoding, interleaving and decoding of each block. As a result, the 
message-passing process induced by the interleaver is confined to within a block. Given the component 
codes and the decoding algorithm, performance can be improved by increasing the interleaver (block) 
size. 

At the cost of prolonged interleaver delay, the increased block size not only enables the decoder to 
gather the extrinsic information from more (noisy) data and parity samples but also makes it easier to 
reduce the covariance between the extrinsic (information) outputs and the corresponding information 
inputs, which, as suggested by Hokfelt et al. [7] and Sadjadpour et al. [8], is a desired attribute 
of the interleaver/deinterleaver pair. Intuitively, the less covariance between information input and
extrinsic output the more 'new' information the extrinsic output would carry over to the new APP 
decoding round, where two consecutive decoding rounds constitute a decoding iteration. Increasing 
the interleaver size also provides greater flexibility in designing a better permutation rule to avoid 
generating low weight codewords and to improve the free distance and spectrum properties of the 
associated turbo code [19], [20]. 

In short, the interleaver and the component codes' structures together determine the distance spec- 
trum of the code and, through the message-passing process embedded in the iterative decoding proce- 
dure, determine the bit error performance. In this paper, we present a new interleaver structure that 
serves both purposes without increasing the total interleaving delay if certain degrees of parallelism are 
allowed and a proper decoding schedule is in place. 

Before formally defining our interleaver structure, we give a tutorial explanation for its effectiveness 
in expanding an iterative decoder's message-passing range. Consider the interleaver structure shown 
in Fig. 1. A data sequence is partitioned into $L$-bit blocks, each block being represented by a
rectangle. Information bits are first permuted within their respective blocks. After such intra-block
permutations, bits within a block are further systematically permuted with those in other neighboring 
blocks. It is obvious that the additional inter-block permutation (IBP) leads to progressively larger 
ranges for message-passing as the number of decoding iterations increases. Intuitively, when the extrinsic
information about a particular information bit is gathered from more and farther data and parity
samples, it becomes more reliable and less correlated with the information bit. Let us elaborate on 
this message-passing process associated with the iterative decoding of an IBP-interleaved turbo code 
(IBPTC). 

In general, a turbo code may consist of several parallel constituent codes and interleavers. For 
simplicity of presentation, however, we will consider only the classic turbo codes that use two identical 
recursive systematic convolutional component codes and one interleaver in the subsequent discussion. 
Fig. 2 is a graph representation that illustrates the behavior of such a classic turbo code that uses an
IBP interleaver (IBPI). In this figure, nodes $U_i$, $U_i'$ represent a data block of length $L$ and its permuted
version. $X_i^0, X_i^1, X_i^2$ denote the uncoded (systematic part) and two encoded blocks while $Y_i^0, Y_i^1, Y_i^2$
are the corresponding received blocks. The bold solid lines connecting $U_i$ and $U_i'$ indicate intra-block
permutations that confine permuted bits to within the same block. An IBP that permutes bits of a
given block with some of those bits within the two immediate adjacent blocks and the original block
is represented by the dotted lines connecting $U_i$ and $\{U_{i+1}', U_i', U_{i-1}'\}$. Both solid and dotted lines also
represent directions of information flow to and from a node during an iterative decoding process.

Take node $U_3$ as an example and, for the convenience of describing the decoding process, refer to the
decoder responsible for decoding node $U_i$ or the corresponding codeword $(X_i^0, X_i^1)$ as the first APP
decoder and that in charge of decoding $U_i'$ or its associated codeword as the second APP decoder, although
physically they might be the same one. After the first APP decoding of node $U_3$, the associated extrinsic
information is passed to nodes $U_2'$, $U_3'$ and $U_4'$ by the interleaver. When the second APP decoder is
decoding the $U_2'$ ($U_4'$) block, it uses extrinsic information derived from $U_3$, along with that derived
from $U_1$ and $U_2$ ($U_4$, $U_5$), as the a priori information to generate new extrinsic information, after
de-interleaving, to nodes $\{U_1, U_2, U_3\}$ ($\{U_3, U_4, U_5\}$). We see that, after just one iteration, the information
associated with the $U_3$ block has already been passed to portions of four neighboring blocks and $U_3$ itself.
Therefore, when the first APP decoder is decoding node $U_3$ for the second time it can use information
collected from these four adjacent blocks. As one proceeds with further iterations, more information

$^1$For brevity, we shall make no distinction between $U_i$ (or $U_i'$) and its associated codeword if there is no danger of confusion.
will be available for the APP decoder while, as we shall see in the next section, the interleaving delay
can be kept at a fixed 2L bits, and, more importantly, the average decoding delay per block can remain 
invariant if certain conditions are satisfied. 

This last property is critically important, for it implies that an IBP interleaver can have an unbounded 
equivalent interleaving depth (size) that is constrained only by the number of turbo decoding iterations 
and the data sequence duration while keeping the average interleaving and decoding delay bounded by 
its local interleaving depth. 

The rest of this paper is organized as follows. In Section II, we introduce the structure of IBP
interleavers and define the related parameters. In Section III, latency and implementation issues are
discussed. Some important codeword weight bounds and intra-block interleaver constraints associated
with the IBPTC are derived in Section IV. When deriving these bounds, we first assume no knowledge
of the intra-block permutation and then obtain some constraints on the intra-block permutation to avoid
producing low-weight codewords. In Section V, we consider the problem of joint intra-block and
inter-block permutation design and derive an upper bound for the achievable minimum distance. Some of
the properties derived in Sections IV and V can be used as IBPI design guidelines. An architecture
for realizing the IBPTC and an IBP algorithm are presented in Section VI, together with a modified semi-random
interleaver that serves as the intra-block interleaver. In Section VII, we provide simulation results to
validate our assertion that the proposed scheme yields significant improvement. The final section
summarizes our work and gives some concluding remarks.

II. Inter-block permutation Interleaver 

A. Interleaver description 

Let $\mathbf{u} = \{u_k\}_{k=-\infty}^{\infty}$ be an input data sequence and $\Pi$ be the interleaver that maps $\mathbf{u}$ into $\mathbf{u}' = \{u'_k\}_{k=-\infty}^{\infty}$
such that $u'_{\pi(k)} = u_k$, where $\pi(k)$ is the permutation rule corresponding to $\Pi$. $\Pi^{-1}$ denotes the
de-interleaver of $\Pi$, representing the inverse mapping that maps $\mathbf{u}'$ back to $\mathbf{u}$, i.e.,
$u_{\pi^{-1}(k)} = u'_k$, where $\pi^{-1}(k)$ is the inverse permutation rule corresponding to $\Pi^{-1}$. Denote the block
interleaver by $\Pi_{block}$ and the corresponding permutation rule by $\pi_{block}(k)$, $0 \le k < L$, $L$ being the
length of the block interleaver. Using the abbreviated notations $\|k\|_L \stackrel{def}{=} k \bmod L$ and $|k|_L \stackrel{def}{=} \lfloor k/L \rfloor$,
where $\lfloor x \rfloor$ is the integer part of $x$, we denote the intra-block interleaver and the inter-block interleaver
by $\Pi_{intra}$ and $\Pi_{inter}$ and define the corresponding permutation rules by

$$\pi_{intra}(k) = L|k|_L + \pi_{block}(\|k\|_L) \tag{1}$$
$$\pi_{inter}(k) = L\left(|k|_L + f_{in}(k)\right) + f_{ib}(k). \tag{2}$$

Similarly, the inverse intra-block and inter-block permutation rules are defined by

$$\pi_{intra}^{-1}(k) = L|k|_L + \pi_{block}^{-1}(\|k\|_L) \tag{3}$$
$$\pi_{inter}^{-1}(k) = L\left(|k|_L + f_{in}^{d}(k)\right) + f_{ib}^{d}(k). \tag{4}$$

The IBP rule is characterized by two functions, $f_{ib}(k)$ and $f_{in}(k)$, and its inverse is characterized by
$f_{ib}^{d}(k)$ and $f_{in}^{d}(k)$; all four are integer-valued functions. $0 \le f_{ib}(k), f_{ib}^{d}(k) < L$ represent the relative
positions within a block after the inter-block interleaving and deinterleaving. $f_{in}(k)$ and $f_{in}^{d}(k)$ are
block indicator functions that determine to and from which block the $k$th bit is moved by these two
operations. $f_{in}(k) = 0$ or $f_{in}^{d}(k) = 0$ means that the $k$th symbol remains in the same block, and the
range constraints, $-S_b \le f_{in}(k) \le S_f$ and $-S_f \le f_{in}^{d}(k) \le S_b$, define the forward and backward
interleaver spans $S_f$ and $S_b$.

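An IBPI is then defined by $\pi_{ibp}(k) = \pi_{inter}(\pi_{intra}(k))$, the composition of the intra- and inter-block
permutation rules, while the IBP de-interleaver (IBPDI) is the composition of the corresponding inverse
rules, $\pi_{ibp}^{-1}(k) = \pi_{intra}^{-1}(\pi_{inter}^{-1}(k))$. They can be expressed as

$$\pi_{ibp}(k) = \bar{f}_{in}(k) L + \bar{f}_{ib}(k), \tag{5}$$
$$\pi_{ibp}^{-1}(k) = \bar{f}_{in}^{d}(k) L + \bar{f}_{ib}^{d}(k), \tag{6}$$

where

$$\bar{f}_{in}(k) = |k|_L + f_{in}(\pi_{intra}(k)), \tag{7}$$
$$\bar{f}_{ib}(k) = f_{ib}(\pi_{intra}(k)), \tag{8}$$
$$\bar{f}_{in}^{d}(k) = |k|_L + f_{in}^{d}(k), \tag{9}$$
$$\bar{f}_{ib}^{d}(k) = \pi_{block}^{-1}(f_{ib}^{d}(k)). \tag{10}$$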
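To make the composition of the intra-block and inter-block rules concrete, the following minimal Python sketch builds a small IBP rule. The block length, the block permutation `PI_BLOCK` and the shift table `SHIFT` are hypothetical values chosen only for illustration; the inter-block stage is assumed to leave the position within a block unchanged and to select the target block periodically with period $S_f + S_b + 1$.

```python
L = 8                                  # block length (illustrative assumption)
S_F, S_B = 1, 1                        # forward and backward spans
T_S = S_F + S_B + 1                    # period of the block-indicator function
PI_BLOCK = [3, 6, 1, 4, 7, 2, 5, 0]    # hypothetical block permutation rule
SHIFT = [0, 1, -1]                     # hypothetical bijection onto {-S_B, ..., S_F}


def pi_intra(k):
    """Intra-block rule: permute inside the block that contains k."""
    return L * (k // L) + PI_BLOCK[k % L]


def f_in(k):
    """Block indicator: how many blocks forward (+) or backward (-) bit k moves."""
    return SHIFT[(k % L) % T_S]


def f_ib(k):
    """Relative position after the inter-block stage (unchanged in this sketch)."""
    return k % L


def pi_inter(k):
    """Inter-block rule: move bit k to a neighbouring block."""
    return L * (k // L + f_in(k)) + f_ib(k)


def pi_ibp(k):
    """IBP rule: intra-block permutation followed by inter-block permutation."""
    return pi_inter(pi_intra(k))
```

Python's floor division and non-negative modulo match the definitions of $|k|_L$ and $\|k\|_L$ for negative indices as well, so the functions can be evaluated on any window of the bi-infinite sequence.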

B. Interleaving and deinterleaving delays 

Define the interleaver delay, $D_i$, and the deinterleaver delay, $D_d$, by

$$D_i = \max_k \{k - \pi(k)\}, \qquad D_d = \max_k \{k - \pi^{-1}(k)\}. \tag{11}$$

The interleaving and deinterleaving delays per decoding round of an IBPI are bounded by

$$D_{i,IBP} = \max_k \{k - \pi_{ibp}(k)\} = \max_k \{k - \bar{f}_{in}(k)L - \bar{f}_{ib}(k)\}
\le \max_k \{k - (|k|_L - S_b)L - \bar{f}_{ib}(k)\} \le (S_b + 1)L \tag{12}$$

and

$$D_{d,IBP} = \max_k \{k - \pi_{ibp}^{-1}(k)\} = \max_k \{k - \bar{f}_{in}^{d}(k)L - \bar{f}_{ib}^{d}(k)\}
\le \max_k \{k - (|k|_L - S_f)L - \bar{f}_{ib}^{d}(k)\} \le (S_f + 1)L. \tag{13}$$

A fully dispersed interleaver (deinterleaver) is one that achieves the corresponding upper bound, and a
symmetric interleaver (deinterleaver) is one with the same forward and backward spans, $S_f = S_b = S$.
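The delay definitions can be checked numerically on a toy example. The sketch below is only an illustration: the block length, the block permutation `PI_BLOCK` and the shift table `SHIFT` are hypothetical, and the delays are evaluated on a finite window of the index set rather than on the whole bi-infinite sequence.

```python
L = 8                                  # block length (illustrative assumption)
S_F, S_B = 1, 1                        # forward and backward spans
T_S = S_F + S_B + 1
PI_BLOCK = [3, 6, 1, 4, 7, 2, 5, 0]    # hypothetical block permutation rule
SHIFT = [0, 1, -1]                     # hypothetical block-indicator values


def pi_ibp(k):
    """Toy IBP rule: intra-block permutation, then a periodic block shift."""
    block, offset = divmod(k, L)
    p = PI_BLOCK[offset]
    return L * (block + SHIFT[p % T_S]) + p


def delays(num_blocks=6):
    """Return (interleaver delay, deinterleaver delay) over a finite window."""
    margin = max(S_F, S_B)             # extra blocks so that the inverse is complete
    wide = range(-(num_blocks + margin) * L, (num_blocks + margin) * L)
    forward = {k: pi_ibp(k) for k in wide}
    inverse = {v: k for k, v in forward.items()}
    core = range(-num_blocks * L, num_blocks * L)
    d_i = max(k - forward[k] for k in core)
    d_d = max(k - inverse[k] for k in core)
    return d_i, d_d


d_i, d_d = delays()
print(d_i, (S_B + 1) * L)              # interleaver delay and its bound
print(d_d, (S_F + 1) * L)              # deinterleaver delay and its bound
```

For this particular toy rule both delays stay strictly below the bounds, i.e., it is not fully dispersed in the sense defined above.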

C. Special inter-block interleavers 
Five IBPIs deserve special attention. 

Definition 1: If $f_{ib}(k) = \|k\|_L$, $\forall k$, then the corresponding inter-block interleaver is called a Type I
(inter-block) interleaver.

Definition 2: An inter-block interleaver is a Type II (inter-block) interleaver if $f_{in}(k) = f_{in}(k + nT_s)$,
$\forall n$, $L|k|_L \le k, k + nT_s < L(|k|_L + 1)$, where $T_s = S_f + S_b + 1$, and the integer-valued function
$f_{in}(k)$ is injective within a period.

Definition 3: An inter-block interleaver is a Type III (inter-block) interleaver if $f_{in}^{d}(k) = f_{in}^{d}(k + nT_s)$,
$\forall n$ and $L|k|_L \le k + nT_s < L(|k|_L + 1)$, where $T_s = S_f + S_b + 1$, and the integer-valued function
$f_{in}^{d}(k)$ is injective within a period.

Definition 4: An inter-block interleaver that possesses all the properties of the Types I, II and III
(inter-block) interleavers is a Type IV (inter-block) interleaver.

Definition 5: An interleaver is a swap interleaver if $\forall i$, $\pi(i) = j \Rightarrow \pi(j) = i$, i.e., $\forall i$, $\pi(i) = \pi^{-1}(i)$.

Each category of interleavers has some desired properties (e.g., Types II and III interleavers are
locally or blockwise periodic) that will prove useful in our interleaver design. For a Type I interleaver,
$\bar{f}_{ib}(k) = \pi_{block}(\|k\|_L)$; i.e., the relative coordinate of a symbol within a block is invariant to the
inter-block permutation. Definition 2 implies that Type II interleavers are periodic and
$\|f_{in}(\pi_{intra}(x)) - f_{in}(\pi_{intra}(y))\|_{T_s} = \|\pi_{block}(\|x\|_L) - \pi_{block}(\|y\|_L)\|_{T_s}$. We also note that the swap interleavers have a
simple symmetric structure that requires less storage for implementation.
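The swap property and the Type II conditions are easy to test mechanically. The helpers below are a hedged sketch, not part of the original design: `is_swap` takes a finite permutation as a list, and `is_type_ii` assumes, for simplicity, that the block indicator depends only on the position within a block, so it is passed as a list of values for offsets $0, \ldots, L-1$.

```python
def is_swap(perm):
    """True if applying the permutation twice returns every index to itself."""
    return all(perm[perm[i]] == i for i in range(len(perm)))


def is_type_ii(f_in_block, t_s):
    """Check local periodicity (period t_s) and injectivity within one period.

    f_in_block[r] is the block-indicator value at offset r of a block
    (assumed identical for every block in this sketch).
    """
    periodic = all(f_in_block[r] == f_in_block[r % t_s]
                   for r in range(len(f_in_block)))
    injective = len(set(f_in_block[:t_s])) == t_s
    return periodic and injective


print(is_swap([1, 0, 3, 2]))                          # pairs (0,1) and (2,3) swapped
print(is_type_ii([0, 1, -1, 0, 1, -1, 0, 1], 3))      # S_f = S_b = 1, T_s = 3
```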
III. Latency and implementation concerns

Although the interleaving process of an IBPI is defined by the composition of the intra- and inter- 
block permutations, it can be implemented by a single step. The encoder knows to which position 
each bit (or sample) in a given block should be moved and can do so immediately after it receives each 
incoming bit. But to encode a given, say the $i$th, interleaved block $U_i'$ into $X_i^2$, it has to wait until the
complete $(i + S)$th block is received. In a sense, IBP is a non-causal operator. The time elapsed between
the instant the encoder receives the first bit of the $i$th block and the moment when it receives the last
bit of the $(i + S)$th block and outputs its first encoded bit of $X_i^2$ is simply $(1 + S)L$ bit durations. By
contrast, a classic TC with a block size of $L$ bits has an encoding delay of approximately $L$ bits.

The interleaving (or deinterleaving) delay per decoding round, or the single-round interleaving delay 
(SRID) for short, is proportional to the encoding delay. But the decoding delay of an IBPTC decoder 
is a much more complicated issue. For the first decoding of each received block, there can be zero 
waiting time, but for later decoding rounds the corresponding delays depend on, among other things, 
the decoding schedule used. With the same block size, the decoding delay of the first received block for 
the classic TC is definitely shorter than that for the IBPTC. But if one considers a period that consists 
of multiple blocks (otherwise one will not have enough blocks to perform inter-block permutation) 
and takes the decoding schedule into account, then the average decoding latency difference can be 
completely eliminated. This is because the APP decoder (including the interleaver and deinterleaver) 
will not stay idle until all blocks within the span of a given block are received. Instead, the APP 
decoder will perform decoding-interleaving or deinterleaving operations for other blocks according to 
a predetermined decoding schedule before it can do so for the given block (and the given decoding
round).

If we define the total decoding delay (TDD) as the time span between the instant a decoder receives 
the first input sample (from the input buffer) and the moment when it outputs its last decision then 
both the IBP and classic approaches yield the same TDD even if only one APP decoder is used. We 
use the following example and Fig. 3 to support our claim; its generalization is straightforward. 

Suppose we receive a total of 7 blocks of samples (in a packet, say) and want to finish decoding 
in 2 iterations (4 decoding rounds). One can easily see from Fig. 3 that a classic TC decoder would 
output the first decoded block in 4 DT cycles, where DT is the number of cycles needed to perform a 
single-block APP decoding plus SRID, while the IBPTC decoder needs 10 DT cycles to output its first 
decoded block. However, if one further examines the decoding delays associated with the remaining 
blocks, then one finds they are 8, 12, 16, 20, 24, and 28 DT cycles for the classic TC decoder while
those for the IBPTC decoder are 14, 18, 22, 25, 27 and 28 DT cycles, respectively. So in the end, both 
approaches reach the final decision at the same time. 

In general, except for the first block and the last $2N - 1$ blocks, both decoders result in a constant
delay of $2N$ DT cycles between two adjacent output blocks, $N$ being the number of decoding iterations.
The IBPTC decoder with $S = 1$ requires a first-block decoding delay (FBDD) of $N(1 + 2N)$ DT cycles
while the FBDD for the classic TC is only $2N$ DT cycles. The inter-block decoding delays (IBDD,
i.e., decoding latency between two consecutive output blocks) for the last $2N - 1$ output blocks of an
IBPTC decoder using a decoding schedule similar to that shown in Fig. 3 form a monotonically decreasing
arithmetic sequence $\{2N - 1, 2N - 2, \cdots, 1\}$ (in DT cycles). The IBDD of a classic TC decoder remains
a constant $2N$ DT cycles. On average, both codes give the same IBDD.
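The delay figures quoted above follow directly from the FBDD and IBDD expressions. The short Python sketch below tabulates the output instants (in DT cycles) of both decoders for a single APP decoder and $S = 1$; the function names are illustrative, and the sketch assumes the packet contains at least $2N$ blocks so that the steady-state and tail phases described in the text both exist.

```python
def classic_output_times(num_blocks, n_iter):
    """Classic TC: every block takes 2N DT cycles, decoded one after another."""
    return [2 * n_iter * (b + 1) for b in range(num_blocks)]


def ibptc_output_times(num_blocks, n_iter):
    """IBPTC with S = 1: FBDD of N(1 + 2N), then gaps of 2N, then 2N-1, ..., 1."""
    if num_blocks < 2 * n_iter:
        raise ValueError("sketch assumes at least 2N blocks")
    times = [n_iter * (1 + 2 * n_iter)]                 # first-block decoding delay
    steady = [2 * n_iter] * (num_blocks - 2 * n_iter)   # constant IBDD phase
    tail = list(range(2 * n_iter - 1, 0, -1))           # last 2N - 1 blocks
    for gap in steady + tail:
        times.append(times[-1] + gap)
    return times


print(classic_output_times(7, 2))   # [4, 8, 12, 16, 20, 24, 28]
print(ibptc_output_times(7, 2))     # [10, 14, 18, 22, 25, 27, 28]
```

With 7 blocks and 2 iterations the two lists reproduce the numbers of the example, and in both cases the last block is released after $2N$ DT cycles per block in total, which is why the total decoding delays coincide.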

The above assessment on the decoding delay is made under the assumption that both codes use the 
same block size L and a single APP decoder is used. As was mentioned in Section I, an IBPTC has an 
equivalent interleaving depth that grows as the number of iterations increases. We will show later that 
with an identical block size an IBPTC always outperforms its classic counterpart. In other words, an 
IBPTC requires a smaller block size and thus less decoding delay to achieve the same performance. 

We now suggest an alternative viewpoint on the concept of IBP based on the above example.
For a classic TC with a block size of $7L$ bits, the FBDD is 28 DT cycles. But if one divides a $7L$-bit block
into 7 subblocks and applies a special interleaver that performs successive intra-subblock and inter-block
permutations on these subblocks, the corresponding decoding delays in DT cycles for these subblocks
are 10, 14, 18, 22, 25, 27, and 28, respectively. Therefore, although both code structures result in identical
TDD, the IBPTC structure is able to supply partial decoded outputs much earlier.

Since the classic TC can use a better interleaver of depth 7L and provide better performance, the 
class of IBPTCs offers an alternative tradeoff between decoding delay and performance. The relative 
performance degradation can at least be partially recovered by increasing the number of blocks (or 
subblocks) involved in decoding and the decoding latency can be further reduced by employing multiple 
APP decoders; see Fig. 4 for a typical decoding schedule using 4 APP decoders. The regularity
of an IBPI makes IBPTCs naturally suited for parallel decoding, which is often a necessity in high
data rate applications. By contrast, parallel decoding of a classic TC is very likely to have difficulty
in finding a memory-efficient solution to the memory bank access collision problem (i.e., more than one APP
decoder trying to access the same information simultaneously).

Fig. 5 (a) shows an IBPTC decoding module for one iteration. The parenthesized numbers in each 
block indicate the corresponding latencies. A pipeline structure similar to that proposed by Hall [3] is
shown in Fig. 5(b). The pipeline structure renders short decoding latency at the expense of increased 
complexity that is proportional to the number of the decoding modules. 

But the pipeline decoder architecture is not necessarily the optimal solution in terms of performance 
and average decoding latency. It has been shown [9] that with an appropriate early-termination scheme, 
multiple, say M (2 < M < number of decoding rounds) APP decoders and proper scheduling, one 
can improve the BER performance of an IBPTC and reduce the associated average iteration number. 
Although early-termination schemes can also be applied to classic TCs, not much performance gain is 
expected and the reduction of the average decoding iteration number is limited since early-terminated 
blocks are unable to pass the highly reliable extrinsic information output to un-terminated blocks. 

[9] further shows that the structure of the IBPTC renders its decoding amenable to highly dynamic
decoding schedules that are both distributive and cooperative: sharing all modularized decoding
resources (the APP component decoders, interleavers/deinterleavers, and memory) while passing
information amongst component decoders. Depending on the decoding schedule and the degree of parallelism,
the IBPTC admits a variety of decoder architectures that meet a large range of throughput and
performance demands. Its performance can be improved by using a proper decoding schedule, increasing the
block size, the interleaving span, the number of decoding iterations and the number of blocks involved
in decoding. The availability of these options is an indication of the flexibility and versatility of the
IBPTC.

We also remark that our proposal is an attractive option for applications like deep space
communication systems in which the earth decoder can enhance its BER performance by increasing 
the number of iterations (thereby increasing the equivalent interleaving size and the FBDD) while the 
encoder in the space segment remains intact. 

The issues of IBPTC decoder architecture, the associated decoding schedule and memory manage- 
ment, though interesting and worthy of detailed investigation, are beyond the scope of this paper and 
will not be discussed henceforth. 

IV. IBP PROPERTIES 

A. Basic definitions and results 

Consider a turbo code $C$ that consists of $m$ rate 1/2 recursive systematic component codes. Denote
by $\mathbf{u} = \{\cdots, u_0, u_1, u_2, \cdots\}$ a binary input sequence, and by $\mathbf{X} = \{\mathbf{x}^0, \mathbf{x}^1, \cdots, \mathbf{x}^m\}$ the codeword
associated with $\mathbf{u}$, where $\mathbf{x}^i$ is the output parity-bit sequence of the $i$th component code while $\mathbf{x}^0 = \mathbf{u}$
represents both the input sequence and the systematic (uncoded) output sequence. A minimum
weight codeword thus consists of the sequences $\{\mathbf{x}^i_{min}, i = 0, 1, \cdots, m\}$. For a bi-infinite sequence
$\mathbf{x}^i = \{\cdots, x^i_{-2}, x^i_{-1}, x^i_0, x^i_1, x^i_2, \cdots\}$, the sequence $\mathbf{x}^i_k = \{\cdots, 0, 0, x^i_{kL}, \cdots, x^i_{(k+1)L-1}, 0, \cdots\}$ is
called the $k$th block-matched (BM) sequence of $\mathbf{x}^i$. We consider the special case $m = 2$ only and refer
to the corresponding turbo code using a conventional block-wise interleaver as a classic TC $C_b$.

Depending on the hypertrellis connections between the adjacent blocks, we can use one of the three 
encoding/decoding options, namely continuous, truncated and terminated 'co-decoding' [6]. In accor- 
dance with these options, we define a continuous IBPTC (C-IBPTC) as one that encodes each data 
block using the end state of the previous coded block as the initial state and adds the tail-bits only 
for the last data block. On the other hand, a discontinuous IBPTC (D-IBPTC) encodes each data 
block individually, either by appending the tail-bits at the end of a block or by using the tail-biting 
encoding. We refer to the former class as the tail-padding IBPTC (TP-IBPTC) and to the latter class
as the tail-biting IBPTC (TB-IBPTC).

The following theorem specifies the conditions under which the free distance of a C-IBPTC will be 
greater than or at least equal to that of its corresponding classic TC. 

Theorem 1: For a classic TC $C_b$, the corresponding C-IBPTC $C_{ibp}$ based on (5)-(10) has a free
distance greater than or equal to that of $C_b$ if a Type I inter-block interleaver is used and all BM
sequence pairs of a minimum weight codeword of $C_{ibp}$, $\{\mathbf{x}^i_{min}, i = 0, 1, 2\}$, are also valid codewords of
the corresponding component codes.

Proof: For a C-IBPTC $C_{ibp}$, there exists at least a finite-weight data sequence $\mathbf{u}_{min} = \{u_k\}$
whose corresponding codeword has the minimum weight. Suppose the nonzero elements of $\mathbf{u}_{min}$ are at
positions $\{k_1, k_2, \cdots, k_n\}$ and the corresponding codeword is $\mathbf{X}_{min} = \{\mathbf{x}^0, \mathbf{x}^1, \mathbf{x}^2\}$. We partition $\mathbf{x}^i$ into
blocks of equal length $L$ bits and construct the associated BM sequences. The $j$th BM sequence and its
IBP-interleaved version generate, for the two component codes, the encoded parity-bit sequences, $\mathbf{x}^1_j$
and $\mathbf{x}^2_j$, with Hamming weights $w_t(\mathbf{x}^1_j)$ and $w_t(\mathbf{x}^2_j)$, respectively. The systematic parts of both component
codes are the same and the corresponding BM sequences are denoted by $\mathbf{x}^0_j$. If $f_{ib}(k) = \|k\|_L$, taking
modulo $L$ on $\pi_{ibp}(k_i)$ gives

$$\|\pi_{ibp}(k_i)\|_L = \|L\bar{f}_{in}(k_i) + \bar{f}_{ib}(k_i)\|_L = \|f_{ib}(\pi_{intra}(k_i))\|_L
= \|\pi_{intra}(k_i)\|_L = \|L|k_i|_L + \pi_{block}(\|k_i\|_L)\|_L = \pi_{block}(\|k_i\|_L). \tag{14}$$

Let $M_l$ be the permutation defined on the space of all binary BM sequences that moves the matched
length-$L$ block of a BM sequence to the $l$th block, i.e., $M_l: \mathbf{x}_k \rightarrow \mathbf{x}_l$, $\forall k$. As the 2-tuple $(\mathbf{x}^0_j, \mathbf{x}^1_j)$ is a valid
codeword of the first component code of $C_{ibp}$ according to our assumption, $\left(\bigoplus_j M_l(\mathbf{x}^0_j), \bigoplus_j M_l(\mathbf{x}^1_j)\right)$, with $\oplus$
denoting addition of binary vectors, is also a codeword of the same component code. (14) implies
that $\left(\bigoplus_j M_l(\mathbf{x}'^0_j), \bigoplus_j M_l(\mathbf{x}^2_j)\right)$ are valid codewords for the second component code of $C_{ibp}$ and $C_b$ since
the additional IBP does not change the relative positions of input bits within a block, where $\mathbf{x}'^0_j$ is the
IBP-interleaved version of $\mathbf{x}^0_j$. The inequality

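$$w_t(\mathbf{x}^i_j) + w_t(\mathbf{x}^i_k) = w_t(M_l(\mathbf{x}^i_j)) + w_t(M_l(\mathbf{x}^i_k)) \ge w_t\left(M_l(\mathbf{x}^i_j) \oplus M_l(\mathbf{x}^i_k)\right), \quad \forall i, j, k \tag{15}$$

then implies that the free distance of $C_{ibp}$, $d_{free}(C_{ibp})$, satisfies

$$d_{free}(C_{ibp}) = \sum_i \sum_j w_t(\mathbf{x}^i_j) \ge \sum_i w_t\left(\bigoplus_j M_l(\mathbf{x}^i_j)\right) \ge d_{free}(C_b). \tag{16}$$

$\blacksquare$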

For a D-IBPTC, the "sub-codewords" associated with each input block automatically satisfy the
requirement on the BM sequences. Since both tail-padding and tail-biting convolutional codes are linear
codes, we have

Corollary 1: For a classic TC $C_b$, the corresponding D-IBPTC $C_{ibp}$ based on (5) has a free distance
greater than or equal to that of $C_b$ if $\Pi_{inter}$ is a Type I inter-block interleaver.

Define two equivalence relations "$\sim$" and "$\equiv$" on the set of integers $\mathbb{Z}$ by

$$\|i - j\|_{T_c} = 0 \Leftrightarrow i \sim j$$
$$|i|_L = |j|_L \Leftrightarrow i \equiv j$$

where $i, j \in \mathbb{Z}$ and $T_c$ is the period (to be defined in the next paragraph) of a recursive systematic
convolutional (RSC) code. Clearly, $i \sim j$ or $i \equiv j$ means $i$ is equivalent to $j$ in either sense.

[26] and [28] show that the encoder of an RSC code acts like a scrambler or equivalently, a linear 
time-invariant IIR filter, on the input sequence and can be realized by using a shift register with both 
feedback and feedforward branches. It is obvious that such an encoder would have a periodic impulse 
response. The rate 1/2 RSC code is specified by the transfer matrix $[1, g_f(D)/g_b(D)]$, where $g_b(D)$ is
usually a primitive binary polynomial of degree $m$. The period of the impulse response of the
nonsystematic part, $g_f(D)/g_b(D)$, is given by $T_c$, whose maximum value is $2^m - 1$. We denote by $\mathbf{u}_{ij} = \{u_k\}$
a weight-2 input sequence whose only nonzero elements are at coordinates $i$ and $j$; the corresponding
codeword is denoted by $\mathbf{X}_{ij}$. Therefore, $T_c$ is also the smallest integer such that $\mathbf{u}_{ij}$, $i \sim j$, will generate
a finite-weight output parity sequence. It is thus easy to show [26]
Lemma 1: Let $\mathbf{u}_{ij}$ be the input sequence to a scrambler with period $T_c$ and $scrb(\mathbf{u}_{ij})$ be the
corresponding output parity sequence. If $i \sim j$, then there exist $\alpha \in \mathcal{N}$ and $\beta \in \mathbb{Z}$ such that
$w_t(scrb(\mathbf{u}_{ij})) = \alpha|i - j|/T_c + \beta$, where $\mathcal{N}$ is the set of positive integers and $\alpha, \beta$ depend on the
encoder (scrambler) structure.

Obviously, if $i \nsim j$, $\mathbf{u}_{ij}$ will generate an infinite-weight parity sequence if there is no termination at the
end of a block. Lemma 1 implies that the codeword weight, $w_t(\mathbf{X}_{ij})$, of a classic rate 1/3 turbo code
satisfies

$$w_t(\mathbf{X}_{ij}) \ge 2 + \alpha\left(\frac{|i - j| + |\pi(i) - \pi(j)|}{T_c}\right) + 2\beta \tag{17}$$

with equality iff

$$i \sim j \text{ and } \pi(i) \sim \pi(j). \tag{18}$$
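Lemma 1 can be observed on a small encoder. The sketch below simulates the parity branch of an RSC code for a weight-2 input; the choice $g_b(D) = 1 + D + D^2$ and $g_f(D) = 1 + D^2$ is an assumption made only for illustration (it is not taken from the text), for which the period is $T_c = 3$.

```python
def parity_of_weight2(i, j, length):
    """Parity weight and final state for a weight-2 input with ones at i and j.

    Assumed encoder: feedback g_b(D) = 1 + D + D^2, feedforward g_f(D) = 1 + D^2.
    """
    s1 = s2 = 0                       # shift-register contents (D and D^2 taps)
    weight = 0
    for t in range(length):
        u = 1 if t in (i, j) else 0
        a = u ^ s1 ^ s2               # feedback sum entering the register
        weight += a ^ s2              # feedforward output: taps 1 and D^2
        s1, s2 = a, s1                # shift the register
    return weight, (s1, s2)


for sep in (3, 6, 9):
    print(sep, parity_of_weight2(0, sep, 20))   # weight grows by 2 per period

print(parity_of_weight2(0, 4, 20))              # separation not a multiple of 3
```

For this assumed encoder the parity weight is $2|i - j|/T_c + 2$ whenever $|i - j|$ is a multiple of 3, i.e., $\alpha = 2$ and $\beta = 2$, and the register returns to the all-zero state. When the separation is not a multiple of the period the register never returns to zero, so the parity weight keeps growing with the sequence length.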
B. Bounds on codeword weights associated with weight-2 input

Define $W_{2,min} \stackrel{def}{=} \min_{(i,j) \in S_m} w_t(\mathbf{X}_{ij})$, where $S_m \stackrel{def}{=} \{(i, j) \mid i \sim j, \pi_{ibp}(i) \sim \pi_{ibp}(j)\}$, and let

$$\delta_{min} = \min_{(i,j) \in S_m} \left[|i - j| + |\pi_{ibp}(i) - \pi_{ibp}(j)|\right]. \tag{19}$$

For the class of C-IBPTCs, $W_{2,min} = w_{2,min} \stackrel{def}{=} \min_{(i,j)} w_t(\mathbf{X}_{ij})$; therefore, maximizing the minimum
weight of the codewords associated with the weight-2 input sequences is equivalent to maximizing
$\delta_{min}$. The next theorem provides an upper bound on the $W_{2,min}$ any IBPTC can achieve if choosing the
intra-block interleaver is not an option.
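For a given permutation rule, $\delta_{min}$ can be found by exhaustive search over a finite window. The sketch below does this for a toy IBP rule; the block length, `PI_BLOCK`, `SHIFT`, the period `T_C = 3` and the encoder constants `ALPHA` and `BETA` are all hypothetical values used for illustration, and the search is restricted to a few blocks rather than the whole index set.

```python
from itertools import combinations

L = 8                                  # block length (illustrative assumption)
T_C = 3                                # assumed RSC period
T_S = 3                                # S_f + S_b + 1 with S_f = S_b = 1
PI_BLOCK = [3, 6, 1, 4, 7, 2, 5, 0]    # hypothetical block permutation rule
SHIFT = [0, 1, -1]                     # hypothetical block-indicator values
ALPHA, BETA = 2, 2                     # hypothetical encoder constants of Lemma 1


def pi_ibp(k):
    """Toy IBP rule: intra-block permutation, then a periodic block shift."""
    block, offset = divmod(k, L)
    p = PI_BLOCK[offset]
    return L * (block + SHIFT[p % T_S]) + p


def delta_min(num_blocks=4):
    """Smallest |i-j| + |pi(i)-pi(j)| over pairs that are multiples of T_C apart
    both before and after interleaving (searched inside a finite window)."""
    best = None
    for i, j in combinations(range(num_blocks * L), 2):
        spread = abs(pi_ibp(i) - pi_ibp(j))
        if (j - i) % T_C == 0 and spread % T_C == 0:
            total = (j - i) + spread
            best = total if best is None else min(best, total)
    return best


d = delta_min()
print(d, 2 + ALPHA * d // T_C + 2 * BETA)   # delta_min and the resulting weight
```

Since both distances are nonzero multiples of the period, the search result is itself a multiple of `T_C` and at least `2 * T_C`; the printed weight is the value $2 + \alpha\,\delta_{min}/T_c + 2\beta$ under the assumed constants.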

Theorem 2: For an IBPTC using an inter-block interleaver, $\Pi_{inter}$, there exists a $\Pi_{intra}$ such that
$W_{2,min} \le 2 + \alpha(S_f + S_b + 2) + 2\beta$, if $L \ge T_c(S_f + S_b + 1)$.

Proof: Consider the partition $\{0, 1, \cdots, L - 1\} = \bigcup_{j=0}^{T_c - 1} \bigcup_{l=-S_b}^{S_f} S_{jl}$, where $S_{jl} = \{m \mid m \in
S_j, f_{in}(m) = l\}$, $S_j = \{n \mid n \sim j, 0 \le n < L\}$, $0 \le j < T_c$ and $-S_b \le l \le S_f$. Note that the
decomposition $S_j = \bigcup_l S_{jl}$ is induced by the function $f_{in}(m)$ or, equivalently, by $\bar{f}_{in}(m)$. Obviously,
the codeword weight of the weight-2 information sequence $\mathbf{u}_{\pi_{intra}^{-1}(m)\pi_{intra}^{-1}(n)}$ is large if $m \nsim n$.
As we are concerned with $W_{2,min}$, only those weight-2 sequences with nonzero coordinate pairs in the
set $\{(m, n) \mid m \sim n \sim j \text{ for some } j \text{ and } m, n \in S_{jl} \text{ for some } l\}$ have to be considered.

Assume that, $\forall j, l$, all pairs $(m, n) \in S_{jl}$ satisfy the inequality $|m - n| > T_c(S_f + S_b + 1)$. For
any pair $(m, n) \in S_{jl}$, $m < n$, and the associated interior set $V = \{m + 1, m + 2, \cdots, n - 1\}$, we have
$|V| \ge T_c \cdot (S_f + S_b + 1)$. If $S_{jl} \cap V \ne \emptyset$, there exists a pair $(m', n') \in S_{jl}$, where $|m' - n'| < |m - n|$.
Otherwise, if $S_{jl} \cap V = \emptyset$, by the pigeonhole principle [29], there exists a set $S_{pq}$ such that $|S_{pq} \cap V| \ge 2$,
which implies that there is a pair $(m', n') \in S_{pq}$, where $|m' - n'| < |m - n|$.
As both cases lead to contradictions, we conclude that there exists a pair $(m, n) \in S_{jl}$ for some $j$ and $l$
such that $|m - n| \le T_c(S_f + S_b + 1)$. Since it is always possible to find $\Pi_{intra}$ such that $|\pi_{ibp}^{-1}(m) - \pi_{ibp}^{-1}(n)| = T_c$,
(17) and (18) then imply that $W_{2,min} \le 2 + \alpha((T_c + T_c(S_f + S_b + 1))/T_c) + 2\beta$. When $S_f = S_b = S$,
we have $W_{2,min} \le 2 + 2\alpha(S + 1) + 2\beta$.

As all data sequences are of finite length in practice, there are either no or not enough blocks for the
first $S_b - 1$ and the last $S_f - 1$ blocks to perform either the complete backward or forward inter-block
permutations. Therefore, we have to modify the IBP range for those blocks by reducing either the
forward or the backward span. Assuming that there are $N \ge \max(S_f, S_b)$ blocks and denoting by
$S_f(i)$ and $S_b(i)$ the forward and backward spans of the $i$th block, we require that for $0 \le i < N$,

$$S_f(i) = \min(S_f, N - i + 1), \qquad S_b(i) = \min(S_b, i). \tag{20}$$

Theorem 2 is modified accordingly.

Corollary 2: For finite-length data sequences and a given inter-block interleaver, $\Pi_{inter}$, whose spans
are specified by (20), $\exists\, \Pi_{intra}$ such that the corresponding IBPTC satisfies $W_{2,min} \le 2 + \alpha \cdot \min(S_f +
2, S_b + 2) + 2\beta$, if $L \ge T_c \cdot \min(S_f + 1, S_b + 1)$.

Theorem 2 and its corollary indicate that lack of control over the intra-block interleaver imposes an
upper bound on the $W_{2,min}$ an IBPTC can achieve. The coordinates of the nonzero elements of the interleaved
sequence $\mathbf{u}'_{ij}$ with $i \equiv j$ will either remain in the same block or be in different blocks with
probabilities close to $1/(2S + 1)$ and $2S/(2S + 1)$ when considering all possible intra-block interleavers and
assuming $S_f = S_b = S$. The resulting codewords for the latter case are very likely to have large weights
while those for the former case have smaller weights, with a worst-case weight of only $2 + 2\alpha + 2\beta$.

To avoid generating low-weight codewords for $\mathbf{u}_{ij}$, we first notice that (17) implies $W_{2,min} \ge 2 +
\alpha(\delta_{min}/T_c) + 2\beta$. The IBP along with the intra-block interleaver determine the relation between $|i - j|$
and $|\pi_{ibp}(i) - \pi_{ibp}(j)|$, and their structures can be optimized to maximize $\delta_{min}$. For a pair of coordinates
$(i, j) \in S_m$, if the integer-valued function $f_{in}^{d}(k)$ is injective and satisfies the locally-periodic property,
$f_{in}^{d}(k) = f_{in}^{d}(k + nT_s)$, for $L|k|_L \le k + nT_s < L(|k|_L + 1)$, where $T_s = S_f + S_b + 1$, then the requirements
$i \equiv j$ and $\pi_{ibp}(i) \equiv \pi_{ibp}(j)$ imply $\|\pi_{ibp}(i) - \pi_{ibp}(j)\|_{T_s} = 0$ and therefore $|\pi_{ibp}(i) - \pi_{ibp}(j)| \ge lcm(T_c, T_s)$,
where $lcm(a, b)$ represents the least common multiple of $a$ and $b$. In other words,

Lemma 2: An IBPTC that uses a Type III inter-block interleaver satisfies

$$\min w_t(\mathbf{X}_{ij}) \ge 2 + \alpha\left(\left[T_c + lcm(T_c, T_s)\right]/T_c\right) + 2\beta. \tag{21}$$
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If Tc and Tg are relative prime, then 

min Wt(Xi,) > 2 + a{Sf + Sb + 2) + 2(3. (22) 
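
As a quick numerical illustration of (21) and (22), the following sketch evaluates the lower bound for a given pair of periods. The function name and the sample values of α and β are hypothetical and chosen only for illustration; the actual constants depend on the component code through (17).

```python
from math import gcd


def weight2_lower_bound(alpha, beta, t_c, t_s):
    """Right-hand side of (21): 2 + alpha*(T_c + lcm(T_c, T_s))/T_c + 2*beta."""
    lcm = t_c * t_s // gcd(t_c, t_s)
    return 2 + alpha * (t_c + lcm) / t_c + 2 * beta


# Hypothetical constants, used only to make the arithmetic concrete.
ALPHA, BETA = 4, 1

# Coprime periods (T_c = 7, S_f = S_b = 1, hence T_s = 3): (21) reduces to (22).
coprime = weight2_lower_bound(ALPHA, BETA, 7, 3)
# Equal periods (T_c = T_s = 7): the bound collapses to 2*(1 + alpha + beta).
equal = weight2_lower_bound(ALPHA, BETA, 7, 7)
print(coprime, equal)
```

With coprime periods the term lcm(T_c, T_s)/T_c equals T_s, which is why the bound grows with the span; when the two periods coincide the same term is only 1.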

C. Constraints on the intra-block interleavers 

Theorem 2 reminds us of the importance of a judicious choice of the intra-block interleaver. For
the question of how to choose an intra-block interleaver whose associated $w_{2,\min}$ is guaranteed to
surpass the worst-case upper-bound of Theorem 2, Lemma 2 gives only an unrefined answer. We need
more elaborate constraints on the selection of the intra-block interleaver to avoid producing a $w_{2,\min}$
smaller than that bound. In general, any one of the four conditions, i ^ j, 7Tibp{i) ^ T^ibpij), i ^ j, 
T^ibpii) ^ T^ibpU), is very likely to result in large Wf(Xjj). However, there is still a small possibility that 
low weight codewords will be generated. Before presenting the requirements for eliminating these low 
weight codewords by using a proper intra-block permutation, we need to define a few new functions to 
facilitate our discussion. 

We denote by $u_k$ the weight-1 sequence whose only nonzero element is at coordinate $k$ and by $scrb(\cdot)$
the RSC encoder that encodes a length-$L$ sequence and terminates at the all-zero state using proper
tail-bits. Based on the above definitions, we further define, for $0 \le i, j < L$,

fiihj) = { _ (23) 
Wt{scrb{uij)), otherwise 

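$$f_2(i,j) = w_t(scrb(u_i)) + w_t(scrb(u_j)) \qquad (24)$$

$$f_3(i,j) = \begin{cases} |i-j|, & \text{if } i \equiv j \\ \infty, & \text{otherwise} \end{cases} \qquad (25)$$

$$f_4(i,j) = \min\left(f_3(i,j),\ f_3(i,j+L),\ f_3(i,j-L)\right). \qquad (26)$$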
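
The two distance functions in (25) and (26) can be sketched in a few lines. This is a minimal illustration under the assumption that the equivalence relation means congruence modulo the encoder period T_c; the function and parameter names are hypothetical.

```python
import math


def f3(i, j, t_c):
    """Distance |i - j| if i and j are congruent modulo t_c, else infinity (25)."""
    d = abs(i - j)
    return d if d % t_c == 0 else math.inf


def f4(i, j, length, t_c):
    """Smallest f3 value over j and its two block-length shifts j + L, j - L (26)."""
    return min(f3(i, j, t_c), f3(i, j + length, t_c), f3(i, j - length, t_c))


# Example with block length L = 20 and T_c = 7.
print(f3(1, 15, 7))       # 14: the separation is a multiple of 7
print(f3(0, 13, 7))       # inf: 13 is not a multiple of 7
print(f4(0, 13, 20, 7))   # 7: the shifted coordinate 13 - 20 = -7 is 7 away from 0
```

The shifted terms in f4 account for pairs that are close to each other only across a block boundary, which is the situation of interest for tail-biting and continuous encoding.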

As the way a low weight codeword is generated depends on how the encoder terminates its state at the 
end of a block, we begin with the TP-IBPTCs. 

C.l TP-IBPTC 

For the class of TP-IBPTCs, a weight-2 input sequence $u_{ij}$, $(i,j) \notin S_m$, cannot generate an infinite-
weight codeword because the encoder state is forced to terminate at the all-zero state at the end
of each block. On the other hand, low-weight codewords may be generated if

$$d_L(i) + d_L(j) + d_L(\pi_{ibp}(i)) + d_L(\pi_{ibp}(j)) < \mathrm{lcm}(T_s,T_c) + T_c \qquad (27)$$

where $d_L(n) = L - \|n\|_L$, and in addition, (i) both $i, j$ and $\pi_{ibp}(i), \pi_{ibp}(j)$ are near the ends of different
blocks, (ii) $i \equiv j$, $\pi_{ibp}(i) \equiv \pi_{ibp}(j)$ and both pairs lie close to the end of a block, or (iii) $i \not\equiv j$
or $\pi_{ibp}(i) \not\equiv \pi_{ibp}(j)$ but both pairs lie close to the end of a block. To avoid generating low weight
codewords out of case (i), we require that

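$$f_2(\|i\|_L,\|j\|_L) + f_2\big(f_{ib}(L|i|_L+\pi_{block}(\|i\|_L)),\ f_{ib}(L|j|_L+\pi_{block}(\|j\|_L))\big) \ge B(T_c,T_s) \qquad (28)$$

where $B(T_c,T_s) = 2+\alpha\left[(T_c+\mathrm{lcm}(T_c,T_s))/T_c\right]+2\beta$. Similarly, for cases (ii)-(iii), $\pi_{block}$ must satisfy

$$f_1(\|i\|_L,\|j\|_L) + f_1\big(f_{ib}(L|i|_L+\pi_{block}(\|i\|_L)),\ f_{ib}(L|j|_L+\pi_{block}(\|j\|_L))\big) \ge B(T_c,T_s) \qquad (29)$$

if $i \equiv j$ and $\pi_{ibp}(i) \equiv \pi_{ibp}(j)$, and

$$f_1(\|i\|_L,\|j\|_L) + f_2\big(f_{ib}(L|i|_L+\pi_{block}(\|i\|_L)),\ f_{ib}(L|j|_L+\pi_{block}(\|j\|_L))\big) \ge B(T_c,T_s) \qquad (30)$$

if $i \equiv j$ but $\pi_{ibp}(i) \not\equiv \pi_{ibp}(j)$, and

$$f_2(\|i\|_L,\|j\|_L) + f_1\big(f_{ib}(L|i|_L+\pi_{block}(\|i\|_L)),\ f_{ib}(L|j|_L+\pi_{block}(\|j\|_L))\big) \ge B(T_c,T_s) \qquad (31)$$

if $i \not\equiv j$ but $\pi_{ibp}(i) \equiv \pi_{ibp}(j)$.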

The above conditions (28)-(31) are not too easy to meet but can be relaxed if we impose more
constraints on $\pi_{inter}$. It is straightforward to show

Lemma 3: For a TP-IBPTC whose inter-block interleaver is of Type IV,

$$\min\, w_t(x_{ij}) \ge B(T_c,T_s)$$

if each element in the set $\Gamma_{T_s} = \{(i,j) : 0 \le i,j \le L-1,\ \|\pi_{block}(i)-\pi_{block}(j)\|_{T_s} = 0\}$ satisfies

$$f_2(i,j) + f_2(\pi_{block}(i),\pi_{block}(j)) \ge B(T_c,T_s)$$

$$f_1(i,j) + f_1(\pi_{block}(i),\pi_{block}(j)) \ge B(T_c,T_s) \qquad (32)$$

and $\forall\, (i,j) \notin \Gamma_{T_s}$ the following two inequalities are satisfied

$$f_1(i,j) + f_2(\pi_{block}(i),\pi_{block}(j)) \ge B(T_c,T_s)$$

$$f_2(i,j) + f_1(\pi_{block}(i),\pi_{block}(j)) \ge B(T_c,T_s). \qquad (33)$$

Starting with an arbitrary intra-block interleaver, say an s-random interleaver [10], we can apply
the above criterion iteratively to find the smallest $L$ for a given component code such that
$w_t(x_{ij}) \ge B(T_c,T_s)$. When $L$ is large enough, e.g., $L > 2(T_c+\mathrm{lcm}(T_c,T_s))$, the constraints imposed by the above
lemma are relatively easy to meet, i.e., a $\pi_{intra}$ that satisfies these constraints is easy to find. For
example, it just has to permute the bits near both ends of a block to places far away from the
ends.
C.2 TB-IBPTC

Denote by $scrb_l(\cdot)$ the tail-biting scrambler of length $l$, and define

Si{l) = {M = 2 + 2scrblf,{ui) \0 < i < 1} 

S2il) = {M = 2 + scrb\,{uij)\tooj,tooj±l,0<t,j<l} (34) 

$$S_k = \bigcup_{l=k}^{\infty} \left[S_1(l) \cup S_2(l)\right] \qquad (35)$$

and let $m_k$ be the smallest integer of the set $S_k$. Obviously, $\{m_k\}$ is a nondecreasing function of $k$.
Denote the least integer $k$ such that $m_k \ge B(T_c,T_s)$ by $k_{\min}$.

We observed that, for a TB-IBPTC whose block size $L > k_{\min}$, a weight-2 sequence $u_{ij}$ generates
a codeword whose weight is less than the bound $B(T_c,T_s)$ only if $(i,j) \in \Gamma_{T_s}$ and $(i,j)$ satisfies the
following conditions:

$$\min\{\|(|i-j|)\|_{T_c},\ \|(L-|i-j|)\|_{T_c}\} = 0 \qquad (36)$$

$$\min\{\|(|\pi_{ibp}(i)-\pi_{ibp}(j)|)\|_{T_c},\ \|(L-|\pi_{ibp}(i)-\pi_{ibp}(j)|)\|_{T_c}\} = 0 \qquad (37)$$

$$\min\{|i-j|,\ L-|i-j|\} + \min\{|\pi_{ibp}(i)-\pi_{ibp}(j)|,\ L-|\pi_{ibp}(i)-\pi_{ibp}(j)|\} < \mathrm{lcm}(T_c,T_s)+T_c. \qquad (38)$$

Such pairs will not exist if $\pi_{inter}$ is of Type III and the corresponding $\pi_{block}$ satisfies

$$f_4(i,j) + f_4\big(f_{ib}(nL+\pi_{block}(i)),\ f_{ib}(nL+\pi_{block}(j))\big) \ge T_c+\mathrm{lcm}(T_c,T_s), \quad 0 \le i,j < L,$$

$\forall\, n$ and $(i,j) \in \Gamma_{T_s}$. In a manner similar to the TP-IBPTC case, the above constraint on $\pi_{block}$ can be
further lessened when a Type IV inter-block interleaver is used. In summary,

Lemma 4: For a TB-IBPTC that uses a Type IV inter-block interleaver with a block length $L > k_{\min}$,
$w_{2,\min} \ge B(T_c,T_s)$ if the corresponding $\pi_{block}$ satisfies

$$f_4(i,j) + f_4(\pi_{block}(i),\pi_{block}(j)) \ge T_c+\mathrm{lcm}(T_c,T_s) \qquad (39)$$

for all $(i,j) \in \Gamma_{T_s}$.

Note that in designing the interleaver for classic turbo codes that use identical tail-biting
convolutional codes as the component codes, one must also consider a constraint similar to that of Lemma 4.
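
A candidate intra-block permutation can be screened against the condition of Lemma 4 with a brute-force scan over all coordinate pairs. The sketch below is only an illustration under stated assumptions: the equivalence relation is taken to be congruence modulo T_c, the inequality is read as non-strict, and the names `satisfies_lemma4` and `pi_block` are hypothetical.

```python
import math
from math import gcd


def f3(i, j, t_c):
    """|i - j| if the separation is a multiple of t_c, else infinity."""
    d = abs(i - j)
    return d if d % t_c == 0 else math.inf


def f4(i, j, length, t_c):
    """Minimum of f3 over j and its shifts by one block length."""
    return min(f3(i, j, t_c), f3(i, j + length, t_c), f3(i, j - length, t_c))


def satisfies_lemma4(pi_block, t_c, t_s):
    """Check f4(i,j) + f4(pi(i),pi(j)) >= T_c + lcm(T_c,T_s) on all pairs whose
    permuted coordinates differ by a multiple of t_s."""
    length = len(pi_block)
    threshold = t_c + t_c * t_s // gcd(t_c, t_s)
    for i in range(length):
        for j in range(i + 1, length):
            if (pi_block[i] - pi_block[j]) % t_s != 0:
                continue  # pair is outside the set of interest
            total = f4(i, j, length, t_c) + f4(pi_block[i], pi_block[j], length, t_c)
            if total < threshold:
                return False
    return True


good = list(range(12))            # identity on a block of length 12
bad = list(range(10))
bad[2], bad[6] = 6, 2             # sends the close pair (0, 2) to (0, 6)
print(satisfies_lemma4(good, 2, 3), satisfies_lemma4(bad, 2, 3))
```

The scan costs O(L^2) operations, which is acceptable for offline interleaver design at the block sizes considered here.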

C.3 C-IBPTC 

For the class of C-IBPTCs, we only have to consider $(i,j) \in S_m$. Low weight codewords are associated
with those pairs whose combined pre-interleaved and post-interleaved distance,
$|i-j| + |\pi_{ibp}(i)-\pi_{ibp}(j)|$, is small. The upper-bound promised by Theorem 2 can be achieved if

$$f_3(i,j) + f_3\big(f_{ib}(nL+\pi_{block}(i)),\ f_{ib}(nL+\pi_{block}(j))\big) \ge T_c+\mathrm{lcm}(T_c,T_s), \qquad (40)$$

for all $n$ and $(i,j) \notin \Gamma_{T_s}$, if $\pi_{inter}$ is of Type III.

The constraint (40) is used to ensure that the pair $(\pi_{ibp}(i), \pi_{ibp}(j))$, though in different blocks (since
$(i,j) \notin \Gamma_{T_s}$), is separated by a large distance.

In analogy to the case of the TB-IBPTC, the constraint on $\pi_{block}$ can be relaxed if the corresponding
$\pi_{inter}$ is more restricted. It is easy to show

Lemma 5: For a C-IBPTC that uses a Type IV inter-block interleaver, if the associated $\pi_{intra}$ is such
that for all $(i,j) \in \Gamma_{T_s}$,

$$f_3(i,j) + f_4(\pi_{block}(i),\pi_{block}(j)) \ge T_c+\mathrm{lcm}(T_c,T_s),$$

then $w_{2,\min} \ge B(T_c,T_s)$.

C.4 Finite-length IBPTCs 

To accommodate the IBP ranges defined by (20) for the finite-length inputs, the range of $f_{in}(i)$ and
the period $T_s$ of a Type II or III inter-block interleaver must be adjusted according to

$$\max(-S_b, -|i|_L) \le f_{in}(i) \le \min(S_f, N-1-|i|_L), \qquad (41)$$

$$T_s(n) = \begin{cases} n+S_f+1, & \text{if } 0 \le n < S_b \\ N-n+S_b, & \text{if } N-S_f \le n < N \\ S_f+S_b+1, & \text{otherwise} \end{cases} \qquad (42)$$

where $0 \le n < N$. Even with the above modifications, low-weight codewords can still be generated for
some weight-2 input sequences. A simple solution is to adjust the block lengths of the beginning $S_b$
and the last $S_f$ blocks such that the block length of the $i$th block satisfies

$$nT_s(n) < L(n) < nT_s(n)+T_s(n), \qquad (43)$$

for some $n$.
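
The boundary adjustment of the local period in (42) can be tabulated directly. The following sketch assumes zero-based block indices, as the range of n in (42) suggests; the function name `local_period` is hypothetical.

```python
def local_period(n, num_blocks, s_f, s_b):
    """Local interleaving period T_s(n) of block n, following (42)."""
    if not 0 <= n < num_blocks:
        raise ValueError("block index out of range")
    if n < s_b:                      # fewer than s_b blocks exist before block n
        return n + s_f + 1
    if n >= num_blocks - s_f:        # fewer than s_f blocks exist after block n
        return num_blocks - n + s_b
    return s_f + s_b + 1             # interior block: full period


# N = 10 blocks with S_f = S_b = 2.
periods = [local_period(n, 10, 2, 2) for n in range(10)]
print(periods)  # [3, 4, 5, 5, 5, 5, 5, 5, 4, 3]
```

Only the first S_b and the last S_f blocks see a shortened period; every interior block keeps the full period S_f + S_b + 1.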

Lemma 6: For an $N$-block IBPTC whose block lengths $L(n)$ are given by (43) and whose $\pi_{inter}$ is of
Type IV IBP with local interleaving periods defined by (42),

$$\min\, w_t(x_{ij}) \ge 2+\alpha\left(\frac{T_c+\mathrm{lcm}(T_c,T_s(n))}{T_c}\right)+2\beta. \qquad (44)$$
Finite-length versions of Lemmas 3-4 can also be established if the block length and the corresponding
IBP rule meet the requirements stated in the above lemma. For a C-IBPTC, however, $\pi_{block}$ needs to
satisfy the additional requirement that for all $0 \le i,j < L$ such that $\|\pi_{block}(i)-\pi_{block}(j)\|_{T_s} = 0$,

$$f_1(i,j) + f_1(\pi_{block}(i),\pi_{block}(j)) \ge 2+\alpha\left(\frac{T_c+\mathrm{lcm}(T_c,S_b+1)}{T_c}\right)+2\beta. \qquad (45)$$

When this requirement is also met then we have

$$\min\, w_t(x_{ij}) \ge 2+\alpha\left(\frac{T_c+\mathrm{lcm}(T_c,S_b+1)}{T_c}\right)+2\beta. \qquad (46)$$

The above discussion shows that the Type IV inter-block interleavers do possess some desired
properties and should be used in conjunction with a proper intra-block interleaver. Hokfelt et al. [7] showed
that, as the correlation function of the extrinsic output decays exponentially, the interleaver should
separate neighboring bits as far as possible. The local periodicity requirement of the Type IV
interleavers is consistent with this intuition and lets bits or samples within the neighborhood of $S_f + S_b$
blocks be moved to different blocks.

V. Upper-bounds of codeword weights for weight-2 and weight-4 input sequences 

This section derives upper-bounds for the weights of IBPTC codewords associated with weight-2 and 
weight-4 input sequences. These upper-bounds are valid for all intra- and inter-block interleavers. 

Recall that Lemma 1 implies that the minimum codeword weight, $w_{2,\min}$, for the weight-2 input
sequences whose coordinates of nonzero elements satisfy $i \equiv j$ and $\pi(i) \equiv \pi(j)$ is upper-bounded
by

$$w_{2,\min} \le 2+\alpha \cdot \left(\frac{|i-j|+|\pi(i)-\pi(j)|}{T_c}\right)+2\beta, \qquad (47)$$

where it is understood that the constants $\alpha$ and $\beta$ might not have the same values as those of (17). A
bound much tighter than (47) can be obtained by applying the approach suggested by Breiling [28] who
partitions the coordinate set associated with both pre-interleaved and post-interleaved sequences into
equivalence classes induced by the equivalence relation $\equiv$. Each equivalence class is further divided
into subsets $\Gamma_z = \{z+mT_c,\ m = 0, 1, \ldots, |\Gamma_z|-1\}$, where $z$ is the smallest index in $\Gamma_z$.

An output (parity) sequence will be of finite weight if the coordinate pair $(i,j)$ associated with the
weight-2 input sequence $u_{ij}$ belongs to the same equivalence class. The parity sequence weight is small
if the pair, besides being in the same equivalence class, are in the proximity of each other, i.e., if
$(i,j) \in \Gamma_z$ for some $z$ and the width of $\Gamma_z$, $(|\Gamma_z|-1)T_c$, is small.
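
The finite-weight property of equivalent coordinate pairs can be observed with a small encoder model. The sketch below is an illustration only: it assumes the 3GPP constituent encoder with feedback polynomial 1 + D^2 + D^3 and feedforward polynomial 1 + D + D^3 (the code family used in the simulations of Section VII), applies no trellis termination, and uses hypothetical function names.

```python
def rsc_parity(bits):
    """Run the assumed RSC encoder and return (parity weight, final state)."""
    s1 = s2 = s3 = 0
    weight = 0
    for u in bits:
        a = u ^ s2 ^ s3           # feedback: taps of 1 + D^2 + D^3
        weight += a ^ s1 ^ s3     # parity: taps of 1 + D + D^3
        s1, s2, s3 = a, s1, s2    # shift register update
    return weight, (s1, s2, s3)


def weight2_input(length, i, j):
    """Length-`length` input with ones at coordinates i and j only."""
    bits = [0] * length
    bits[i] = bits[j] = 1
    return bits


LENGTH = 40
results = {}
for sep in range(1, 22):
    results[sep] = rsc_parity(weight2_input(LENGTH, 0, sep))
    w, state = results[sep]
    print(sep, w, state == (0, 0, 0))
```

For this encoder the state returns to zero only when the separation is a multiple of 7, and the parity weight of such inputs grows by a fixed amount per additional period, which matches the linear dependence on the normalized distance in (47).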
To avoid generating low-weight codewords, therefore, an optimal interleaver should send any pair
of coordinates in a given subset to different equivalence classes and, if that is not possible, to different
subsets or at least to two far-apart coordinates within a subset. Let $\Gamma_z^{(m)}$ and $A_m$ $(0 \le z < A_m)$ be the
subsets and the number of subsets associated with the coordinates of the $m$th component encoder input
sequence. The cardinalities of the $A_m$ subsets differ at most by 1, i.e., $|\Gamma_z^{(m)}| = \lfloor L/A_m \rfloor$ or $\lfloor L/A_m \rfloor + 1$.

Invoking the aforementioned pigeonhole principle, Breiling showed that if the pair $(A_1, A_2)$ is such that
$\lceil L/A_1 \rceil > A_2$ then any interleaver would map a pair of coordinates $(i,j)$ that lies in the same subset
$\Gamma_z^{(1)}$ to $(\pi(i), \pi(j))$ which also belongs to an identical subset $\Gamma_{z'}^{(2)}$, resulting in

$$w_t(x_{ij}) \le 2+\alpha\left(\left\lceil \frac{L}{A_1} \right\rceil + \left\lceil \frac{L}{A_2} \right\rceil\right)+2\beta. \qquad (48)$$

Minimizing the right hand side of the above inequality with respect to the pair $(A_1, A_2)$, Breiling
then obtained a very tight upper-bound.
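
The minimization over the pair of subset counts can be carried out by exhaustive search for moderate block lengths. The sketch below minimizes an expression of the form 2 + α(⌈L/A_1⌉ + ⌈L/A_2⌉) + 2β subject to the pigeonhole condition ⌈L/A_1⌉ > A_2; this form, the constants α and β, and the function names are assumptions made for illustration and may differ in detail from the bound in [28].

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def bound_value(length, a1, a2, alpha, beta):
    """Assumed form of the weight-2 bound for a given pair (a1, a2)."""
    return 2 + alpha * (ceil_div(length, a1) + ceil_div(length, a2)) + 2 * beta


def minimize_bound(length, alpha, beta):
    """Search all admissible pairs and return (best value, a1, a2), or None."""
    best = None
    for a1 in range(1, length + 1):
        for a2 in range(1, length + 1):
            if ceil_div(length, a1) <= a2:
                continue  # pigeonhole condition not met
            value = bound_value(length, a1, a2, alpha, beta)
            if best is None or value < best[0]:
                best = (value, a1, a2)
    return best


# Hypothetical constants alpha = 4, beta = 1 and block length L = 100.
print(minimize_bound(100, 4, 1))
```

The admissible pairs cluster around A_1 ≈ A_2 ≈ √L, which is the origin of the square-root growth of the weight-2 bound with the block length.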

A. Upper-bound for weight-2 input sequences

It is clear that, given the same set of parameters {L, Aj,Tc, an IBP interleaver has subsets 

within its span to choose from for placing members of the set G fI^^}, for some Q < z < L. Thus, 

assuming a large enough block size ($L$), the priority of an optimal IBP rule in permuting coordinates
of the same equivalence class follows the order: (i) to different blocks, (ii) to different equivalence
classes of the same block, (iii) to different subsets of the same equivalence class, and finally, (iv) to
far-apart coordinates within the same subset. Obviously, the partition of an equivalence class into
subsets plays a pivotal role in optimizing an IBP rule. With a minimum loss of generality, we assume
$\|A_1\|_M = \|A_2\|_M = 0$, $M = T_sT_c$, where $T_s = 2S+1$. Given these parameter values, we consider the
following (subset) partition.



r 



(fc) 



\i\\M + \i 



\i\\M + [||-^||ai. 





L 




Afe 



\L 



+ Mj : 0<j< 



where < z < ||Iv||Afe — ||-^^||m- 



Afc 
\L\



M 

M<i<Ak- M. 



NIm + [ll-^IUfe ~ ll-^IU/] 



Afe 



[Aa 



I-^IIa. 



where A^ - M < z < A^ - (M - HLUm)- 
{Mm+[\\L\\a, - \\L\\m] \^] + [Afc - ||L||a, 
where A^. - (M - | |m) < ^ < A^. 
An exemplary partition of (49) is shown in Fig. 6 where the integers represent the coordinates of either
an input or output sequence and each row consists of three segments with a segment representing a
subset of size 3 or 2. The IBP rule sends bits in rows labelled by different capital letters to different
blocks while those in the same row are interleaved to the same block.

By using an argument similar to that leading to (48) and invoking the partition of (49) along with 
the permutation rule (i)-(iv) mentioned at the beginning paragraph of this subsection, we obtain 

Theorem 3: For the class of IBPTCs, the minimum codeword weight W2,min for weight-2 input se- 
quences is upper-bounded by 



W2,min < 2 + a 

where (Ai, As) G x D = {1, 2, ■ 
When Ai = A2, we have 

W2,mi 



mm 

(Ai.Aa) 



TsL 
Ai 



+ 



A, 



-2 +2/3 



(50) 



\L/M~\ - 1}, if L > MTc and 



Ai 



> 



A2 



< 2 + 2a- 



\/TsL — TgTc 



+ 2/3, 



(51) 



Proof: (51) follows directly from the partition (49) and the optimal periodic IBP. The corre- 
sponding interleaver results in bound-achieving codewords Xjj, G s^, when [^] > Hence 



mm 

(Ai,A2) 



Ai 



+ 



A, 




The upper-bound (50) can be rewritten as 



W2-. 









min < 


+ 


[a. \ 


Ai 





2] + 2/3 



(52) 



(53) 



If we choose (Ai,A2) = (Ao,Ao) with Aq = M([^^] - 1), i.e., Aq is a multiple of M and Ag < TsL 
then (50) implies 
Theorem 3 implies that $w_{2,\min}$ grows linearly with $\sqrt{T_sL}$ when $L$ is large.

B. Upper-bound for weight-4 input sequences

Let the coordinates of nonzero elements of a weight-4 input sequence be $(i,j,k,l)$, where $i < j <
k < l$. If we divide these coordinates and their permuted positions respectively into two pairs each
according to their natural order, i.e., $(i,j)$, $(k,l)$ and, say, $(\pi(i),\pi(k))$, $(\pi(j),\pi(l))$, then a low-weight
codeword results if each pair belongs to the same subset. More specifically, the minimum codeword
weight, $w_{4,\min}$, for weight-4 input sequences whose nonzero coordinates $(i,j,k,l)$ are such that
$i \equiv j$, $k \equiv l$, $\pi(i) \equiv \pi(k)$, $\pi(j) \equiv \pi(l)$ satisfies

$$w_{4,\min} \le 4+\alpha \cdot \left(\frac{|i-j|+|k-l|+|\pi(i)-\pi(k)|+|\pi(j)-\pi(l)|}{T_c}\right)+2\beta \qquad (55)$$

or

$$w_{4,\min} \le 4+\alpha \cdot \left(\frac{|i-j|+|k-l|+|\pi(i)-\pi(j)|+|\pi(k)-\pi(l)|}{T_c}\right)+2\beta \qquad (56)$$

if $(i,j,k,l)$ are such that $i \equiv j$, $k \equiv l$, $\pi(i) \equiv \pi(j)$, $\pi(k) \equiv \pi(l)$.

These upper-bounds are obtained by considering the three pre- and post-interleaving distributions of
the 4-tuple $(i,j,k,l)$ shown in Fig. 7 (a)-(c). These three are the distributions that most likely lead to
low-weight codewords. There are other candidate distributions (e.g., Fig. 7 (d)) but the corresponding
upper bounds are likely to be larger than those given by (55) and (56).

Following an approach similar to that of [26] and taking into account the extra degrees of freedom
offered by an IBP interleaver, we obtain

Theorem 4: The IBPTC minimum codeword weight for weight-4 input sequences is upper-bounded
by



W4,mm < 4 + 2a min 

.(Ai,A2) 



Ai 



A2 



4/3 (57) 



when (Ai,A2) e D D, where D = {1,2,-- - , [L/2\}, satisfies (i) Aig) > ), (ii) ■ Aifi^ > 

(y^T^ ' ^"^^ l|Aj||M = 0, i = 1,2, where Vt = [^J and k = 1,2, ■■■ ,Ts — 1. Moreover, for the 
special case, Ai = A2 and if L > + — ^ and Ts > I the upper-bound yields the compact 
expression 

T L 

W4,min,ibp < 4 + 4a + 4/3, (58) 



22 



where 



^ - iW(ir-(ir-fi-v(ir-(ir (-> 

Pi = 3T^L-^{T, + 2T^f, (60) 
gi = -T^L' + {T! + 2T^)L-^{Ts + 2T!)\ (61) 



Proof: See Appendix B. ■
Again, we observe that for large $L$, the upper-bound grows linearly with $(T_sL)^{1/3}$. The minimum
codeword weights associated with weight-2 and weight-4 input sequences are upper-bounded by
increasing functions of $T_sL$.

C. Interleaving gain comparison 

As shown in Section III, except for the first output block, an IBPTC decoder yields an inter-block
decoding delay the same as that of a classic TC with the same block size. However, the encoding delay
or the SRID of an IBPTC is $(1+S)$ times larger. Let $w_{2,\min,block}$ and $w_{4,\min,block}$ be the minimum
codeword weights associated with weight-2 and weight-4 input sequences of classic TCs with block size
$(S+1)L$; then we have [28]

W2,min,block < 2 + 2a^=^^^—- + 2f3 (62) 

y (S + 1)L — Tc 

WA,min,biock < 4: + Aa— ^ . 2 H ^ ^Z^" (^3) 

((5 + 1)L - 1) 3 - ((5 -M)L - 1) 3 + 1 - T, 

Comparing the above equations with (51) and (58) and noting that $T_s = 2S+1$, we conclude that, as far
as weight-2 and weight-4 input sequences are concerned, a 'good' IBPTC can bring about improvement
factors of $\left(2-\frac{1}{S+1}\right)^{1/2}$ and $\left(2-\frac{1}{S+1}\right)^{1/3}$, respectively.
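
The size of these improvement factors is easy to tabulate. The sketch below assumes that the weight-2 and weight-4 bounds scale as the square root and the cube root of the effective interleaving size, so that the gain over a classic TC of block size (S+1)L is a power of T_s L / ((S+1)L) = (2S+1)/(S+1); the exponents and the function name are assumptions made for illustration.

```python
def improvement_factors(span):
    """Return (weight-2 factor, weight-4 factor) for a symmetric span S."""
    ratio = (2 * span + 1) / (span + 1)   # equals 2 - 1/(S + 1)
    return ratio ** 0.5, ratio ** (1.0 / 3.0)


for s in (1, 2, 3):
    w2_gain, w4_gain = improvement_factors(s)
    print(s, round(w2_gain, 3), round(w4_gain, 3))
```

Both factors increase with S but stay below √2 and the cube root of 2 respectively, since the ratio (2S+1)/(S+1) approaches 2 from below.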

VI. IBPTC ARCHITECTURE AND ALGORITHMS 

A. Implementation concern of IBPI 

We have defined a swap interleaver as one such that $\forall\, i$, $\pi(i) = \pi^{-1}(i)$. As an IBPI moves bits
or symbols in a given block to positions within itself and those in the neighboring $S_f + S_b$ blocks,
the associated interleaver normally requires at least $(S_f+S_b+1)L$ units of memory (see Fig. 8(a)),
where the number of bits per unit depends on the system's precision requirement. However, if we use
a symmetric ($S_f = S_b = S$) swap interleaver, then only $(S+1)L$ units of memory are needed for
temporary interleaving or deinterleaving storage. As shown in Fig. 8(b), we do not have to move those
forwardly-permuted symbols until after all earlier (backward) blocks have been filled by interleaved
(or deinterleaved) symbols and after their contents have been dumped. Moreover, a symmetric swap
interleaver has the same interleaving and deinterleaving structure and can be implemented by a single
permutation table or algorithm. These advantages of symmetric swap IBPIs will still be maintained
when we consider the implementation of the combined intra- and inter-block permutations. An IBPI
using the swap structure has only to perform memory content swapping between the current block and
the backward blocks. Furthermore, if $\pi_{inter}$ is a Type IV interleaver, the only IBP operation is simply
$m \leftrightarrow m - nL$, where $n \in \{1, 2, \ldots, S\}$.
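
The swap property can be made concrete with a toy inter-block permutation. The rule below is a hypothetical construction written for this illustration and is not the algorithm of Table I: a symbol at in-block position j is assigned the block offset n = j mod (S+1), and blocks are paired so that every exchange m ↔ m − nL is undone by applying the same table again.

```python
def swap_ibp(num_blocks, block_len, span):
    """Build a toy symmetric swap inter-block permutation as a list."""
    perm = list(range(num_blocks * block_len))
    for b in range(num_blocks):
        for j in range(block_len):
            n = j % (span + 1)            # block offset for this position
            if n == 0:
                continue                  # offset 0: symbol stays in its block
            # Blocks in an even group of width n exchange with the block n ahead.
            if (b // n) % 2 == 0 and b + n < num_blocks:
                m = b * block_len + j
                partner = (b + n) * block_len + j
                perm[m], perm[partner] = partner, m
    return perm


perm = swap_ibp(num_blocks=6, block_len=8, span=2)
is_involution = all(perm[perm[m]] == m for m in range(len(perm)))
max_block_shift = max(abs(perm[m] // 8 - m // 8) for m in range(len(perm)))
print(is_involution, max_block_shift)
```

Because the table is its own inverse, the same lookup serves both the interleaver and the deinterleaver, which is the implementation advantage noted above.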

Theorems 1 and 2 give us some guidelines for designing an IBP algorithm. In the previous paragraph,
we showed that the IBP with the swap structure has an implementation edge. Shown in Table I is a
symmetric IBPI with $S = S_f = S_b$ and SRID $= (S+1)L$. It can be easily seen that

Corollary 3: The algorithm in Table I satisfies the requirements of both Type IV and Type V
interleavers.

B. Modified semi-random interleaver 

Semi-random interleavers [10] are designed to eliminate "short cycles" that send two close-by bits
to the vicinity of each other after interleaving. These interleavers were, however, originally designed
to work in the block interleaving setting, and therefore cannot avoid two new classes of short cycles
arising in TB-IBPTCs and C-IBPTCs. A tail-biting convolutional code begins and ends at the same
state; hence if two close-by bits in a block are respectively intra-block permuted to the beginning and
the ending parts of that block, and if the two bits remain in the same block after the IBP interleaving,
a short cycle will result as the proposed IBP does not alter their relative positions within a block. For
the class of C-IBPTCs, we also want to prevent similar intra-block interleaving results because the
IBPI may send such a pair to the ending and beginning parts of two neighboring blocks. We therefore
modify the constraint of [10] as

$$d_{\min}(\pi(i),\pi(j)) > S_2, \quad 0 \le i,j < L \qquad (64)$$

where $d_{\min}(i,j) = \min(|i-j|, L-|i-j|)$. This new constraint excludes the possibility that two symbols
at the beginning and the ending parts of a block would remain there after the interleaving.
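
A greedy search with restarts is one simple way to generate a permutation that meets a circular-distance constraint of this kind. The sketch below is a minimal illustration, not the authors' procedure: it assumes the usual semi-random premise that the constraint applies to pairs with d_min(i, j) ≤ S_1, and the function names, parameter names and default values are hypothetical.

```python
import random


def circ_dist(a, b, length):
    """Circular distance min(|a-b|, L-|a-b|)."""
    d = abs(a - b)
    return min(d, length - d)


def is_valid(perm, s1, s2):
    """True if every pair within circular distance s1 is spread beyond s2."""
    n = len(perm)
    return all(circ_dist(perm[i], perm[j], n) > s2
               for i in range(n) for j in range(i + 1, n)
               if circ_dist(i, j, n) <= s1)


def modified_s_random(length, s1, s2, seed=0, max_tries=1000):
    """Greedy construction; restarts with a new shuffle after a dead end."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        remaining = list(range(length))
        rng.shuffle(remaining)
        perm = []
        for i in range(length):
            # Values already placed at positions circularly close to i.
            near = [perm[j] for j in range(i) if circ_dist(i, j, length) <= s1]
            pick = next((c for c in remaining
                         if all(circ_dist(c, v, length) > s2 for v in near)), None)
            if pick is None:
                break                     # dead end: try another shuffle
            remaining.remove(pick)
            perm.append(pick)
        else:
            return perm
    raise RuntimeError("no valid permutation found; relax s1/s2 or raise max_tries")


perm = modified_s_random(length=60, s1=2, s2=2)
print(is_valid(perm, 2, 2))
```

The only change relative to a conventional semi-random search is the use of `circ_dist` in place of the plain absolute difference, so that positions at opposite ends of a block are treated as neighbors.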
VII. Simulation Results

A. Error probability performance 

Computer simulation results reported in this section use the RSC code of the 3GPP standard,
$G(D) = \frac{1+D+D^3}{1+D^2+D^3}$ [27], the interleaver of the same standard or the modified semi-random interleaver of
(64) for the intra-block permutation while the IBP follows the algorithm of Table I. Each simulation
run consists of 1000 blocks. We use the Log-MAP or MAX-Log-MAP algorithms for decoding classic
TCs and TP-IBPTCs, and the sliding-window Log-MAP or the sliding-window MAX-Log-MAP algorithms
for decoding TB-IBPTCs and C-IBPTCs. In most cases, we compare the performance of classic TCs
and IBPTCs under the assumption that either both codes have the same encoding delay (SRID) or they
have the same average IBDD. As discussed in Section II, the latter implies that both codes use a single
APP decoder and identical block size $L$, and the former implies that the classic TC uses a block
size of $(1+S)L$ while the IBPTC has a block size of $L$.

Figs. 9 and 10 show the BER performance of rate 1/3 turbo coded systems with 10 iterations 
and Log-MAP algorithm. The interleaver parameter values for the IBPTC are L = 402, S = 1 or 
L = 265, S = 2. Compared with the performance of the classic TC with L = 400, the IBPTCs yield
0.7-0.9 dB performance gain at BER=10~^ and 1.0-1.2 dB gain at BER=10~^. When both codes have 
the same SRID, the IBPTC provides 0.4-0.6 dB performance gain at BER between 10"^ and 10"^ 

Figs 11 and 12 show the BER performance of rate 1/2 turbo coded systems. The MAX-Log-MAP 
algorithm is used in this example. We compare the performance of the classic TC with L = 1320 
and the IBPTCs with L = 660, S = 1 and L = 440, S = 2. Using L = 660, S = 1 and the 3GPP
interleaver as the intra-block interleaver, the IBPTCs have 0.4-0.45 dB and 0.3 dB gain at BER=10~^ 
and 10~^, respectively. For other cases, the IBPTCs give 0.4-0.45 dB gain at BER=10~^ and 0.4-0.6 dB 
gain (except for the case TP-IBPTC with L = 440, 5 = 2) at BER=10-*^. It is clear that the IBPTCs 
outperform the conventional TCs with nearly the same SRID. Furthermore, the proposed modified 
s-random interleaver outperforms the 3GPP defined interleaver, especially when the interleaver span is 
small {S = 1). 

These figures reveal that the proposed IBPTCs yield superior performance, sharper slope of the BER 
curve at the waterfall region and lower error floor when compared with the corresponding performance 
curves of the classic TCs for a variety of different code rates and decoding algorithms. The improvement 
is more impressive for smaller SRIDs, and with the same SRID, a larger interleaver span (S) leads to 
better performance. 
Fig. 13 shows the BER performance of rate 1/3 IBPTCs that use the 3GPP defined interleaver as
the intra-block interleaver. Either the Log-MAP or the MAX-Log-MAP algorithm is used and 15
decoding iterations are assumed. All these IBP parameter values, $(L,S) = (660,1)$, $(440,2)$ or $(330,3)$,
give the same SRID of 1320 samples. The performance is consistent with our prediction: the larger the
interleaver span is, the better the system performance becomes. The performance deteriorates when
the period of the encoder, $T_c$, and the period of the IBPI, $T_s$, are the same. For this case the lower-bound
of (21) becomes $2(1+\alpha+\beta)$ which is much smaller than the corresponding upper-bound given in
Theorem 2. By contrast, the two bounds are much closer if $T_c \ne T_s$ and both bounds give identical
values if $T_c$ and $T_s$ are relatively prime.

Finally, we want to show that the IBPTC requires an interleaver latency much smaller than that 
of classic TCs with similar BER performance. Fig. 14 shows the BER performance of rate 1/3 turbo 
coded systems that employ 10 decoding iterations and the Log-MAP algorithm. All the interleavers 
are taken from the 3GPP defined interleaver. The average interleaver and deinterleaver latency of the 
IBPTCs is about 800. It is observed that the performance of the IBPTCs is bounded by those of turbo
codes with block size L = 2800 and L = 3600. In other words, an IBPTC achieves BER performance
similar to that of a classic TC which requires an interleaving latency 3.5 to 4.5 times longer.

All these figures show that the TB-IBPTC has the best performance, followed by the C-IBPTC and 
then the TP-IBPTC. 

B. Covariance and convergence behavior 

Fig. 15 shows the covariance behavior for both the IBPTC and the classic TC with the same SRID, where the
IBPI has S = 1 and SRID = 800 and the interleaving depth for the classic TC is L = 800. It indicates
that the covariance is small for the IBPTC even at SNR = 0.5 dB while much higher covariance is
observed for the classic TC at much higher SNR. The IBP collects extrinsic information from farther
and farther away as the number of iterations increases, which is expected to result in smaller
covariance.

Two similar techniques have been proposed to study the convergence behavior of iterative decoding 
schemes, namely, the extrinsic information transfer chart (EXIT chart) of [23], [22] and the extrinsic 
information SNR evolution chart of [21]. The latter is a simplified version of Richardson's density 
evolution approach [24]. As the extrinsic information in the iterative decoding can be approximated 
by a Gaussian random variable, the evolution of the corresponding probability density as a function 
of iteration can be characterized by the SNR evolution where SNR is defined as the squared mean to 
variance ratio of the density.

Fig. 17 compares the EXIT behavior of our proposal and the classic TC with the same SRID. The 
IBPTC yields mutual information almost equal to one at SNR = 0.5 and 1.0 dB, but the classic TC 
needs SNR = 2.0 dB to achieve the same performance. The SNR evolution chart shown in Fig. 16 
exhibits similar behavior of the two codes, all indicating the proposed IBPTC gives superior perfor- 
mance. Both figures also reveal that our code has a much faster convergence speed. The much larger 
step of the IBPTC curves means the associated APP decoder generates more information or extrinsic 
information with larger signal to noise ratio for the next stage decoder. Such a trend has been expected 
when we examine the factor graph structure of the IBPTC in Fig. 2. 

VIII. Conclusion 

We present a class of IBP interleavers that enables an iterative decoder to collect information from 
a large span of neighboring samples with a bounded SRID or average IBDD. We derive the worst- 
case codeword weight upper bound for the weight-2 input sequences and provide constraints on the 
selection of the associated intra-block interleaver. The codeword weight upper-bounds for the weight-2 
and weight-4 input sequences, when we have the freedom to select both the inter- and intra-block 
interleavers, are also given. It is shown that these bounds are better than those of the classic TCs 
with the same SRID. Our analysis also indicates that an IBP rule that possesses some regularities like 
periodicity and symmetry is likely to be a good IBP though global optimality has not been established. 

Using some of the properties and bounds we derived as design guidelines, we propose a simple IBP 
algorithm, suggest a modified semi-random intra-block interleaver and address some implementation 
and hardware architecture issues. Simulation results based on the 3GPP standard turbo component 
code show that the IBPTCs provide 0.3 ~ 1.2 dB performance gain. The performance curves have 
sharper slopes in the waterfall region with respect to those of the classic TCs with the same SRID or 
average IBDD. The class of IBPTCs achieves the same performance as that of the classic TC with a
much reduced SRID.

The class of proposed IBPTCs provides flexibility and tradeoffs that are not found in the classic 
TCs. In particular, it possesses some desired features that suit high data rate applications naturally. 
Since it can use any existing block-wise interleaver as its intra-block interleaver, the encoder/decoder
structure is backward compatible in the sense that the special case S = 0 degenerates to the classic
TC structure.
Appendix
Proof of Theorem 4 

We first notice that, besides those finite weight codewords resulting from termination, as illustrated 
in Fig. 7, there are three conditions under which a weight-4 input sequence of an IBPTC will generate 
a finite- weight codeword. In Case (a), the codeword consists of two finite- weight segments (in different 
blocks) generated respectively by two weight-2 input sequences and thus the corresponding codeword 
weight upper-bound is simply twice that given in Theorem 3. Case (b) considers the situation when 
two pairs of coordinates from rf^ and rf^ of either the same block or different blocks are permuted 

(2) (2) 

to the same block with one coordinate from each pair mapped to two subsets Fj. and , where the 
pair {k,l),k ^ I belongs to the same equivalence class while the remaining two coordinates mapped to 

(2) (2) 

another two subsets Fm and Fn with m 7^ n in another equivalence class. Case (c) is similar to Case 
(b) except that the two subsets that contain the two permuted pairs are in different blocks. 

Note that if $k = l$ and $m = n$, then both cases will result in a codeword weight upper bound
similar to that obtained in [28]. But this is impossible, as coordinates from different blocks will not
be mapped into coordinates in the same subset (defined by (49)) by an optimal interleaver. This is
because the spatially symmetric structure of a classic TC implies that, for every input sequence $u$ of
the code $X$ that uses the interleaver $\pi$, there exists a $u'$ such that the codewords generated by $(u, \pi)$ and $(u', \pi^{-1})$
have identical weight. This observation and the fact that both component encoder outputs, $x^1$ and $x^2$,
contribute equally to the resulting codeword weight suggest that $\pi$ and $\pi^{-1}$ have the same effect on
the weight distribution, and that optimizing the deinterleaver rule results in the same mapping as the
optimal interleaver.
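The symmetry argument can be checked numerically on a toy code. The sketch below is only an illustration under stated assumptions: the component code is taken to be an unterminated (1, 5/7) recursive systematic convolutional encoder, the interleaver is a random permutation, and the helper names (`rsc_parity`, `interleave`, `inverse`, `turbo_weight`) are hypothetical and not taken from the paper. Choosing $u' = \pi(u)$ simply exchanges the inputs seen by the two component encoders, so the two codeword weights coincide.

```python
import random

def rsc_parity(bits):
    """Parity stream of an unterminated (1, 5/7) RSC encoder (assumed toy code)."""
    s1 = s2 = 0
    out = []
    for u in bits:
        a = u ^ s1 ^ s2          # feedback polynomial 1 + D + D^2
        out.append(a ^ s2)       # feedforward polynomial 1 + D^2
        s1, s2 = a, s1           # shift the register
    return out

def interleave(bits, perm):
    """Output position i carries the input bit at position perm[i]."""
    return [bits[p] for p in perm]

def inverse(perm):
    """Inverse permutation, so that interleave(interleave(x, perm), inverse(perm)) == x."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv

def turbo_weight(u, perm):
    """Hamming weight of the systematic bits plus both parity streams."""
    return sum(u) + sum(rsc_parity(u)) + sum(rsc_parity(interleave(u, perm)))

rng = random.Random(0)           # fixed seed, illustrative only
n = 32
perm = list(range(n))
rng.shuffle(perm)
u = [rng.randint(0, 1) for _ in range(n)]

u_prime = interleave(u, perm)    # u' = pi(u)
w_forward = turbo_weight(u, perm)
w_inverse = turbo_weight(u_prime, inverse(perm))
print(w_forward, w_inverse)      # the two weights are equal
```

The equality holds for any input and any permutation in this toy setting because the same encoder is used in both branches; it does not depend on the particular generator polynomials assumed here.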

Since we have to consider the scenario k 7^ / and m ^ n only, the worst case occurs when both | — / | 

and \m — n\ are less than T^T^. In other words. Cases (b) and (c) concern the situation in which the 

~ (2^ 

pairs {nibp{x),TTibp{w)) and {7rn,p{y) , 7iibp{z)) belong to distinct supersubsets where a supersubset Fj 

consists of M/Tg = consecutive subsets of the same equivalence class. Each block therefore has y' 

(2) (2) ~ (2) 

supersubsets, and F). and F] are in the same supersubset Fj if \k\M = \1\m = j and \ \k — = 0, 

or equivalents, = Ur=o' r[|]||^^+,7,^+|^.|^^Af' ^ = 1,2. 

Let Ai and A2 be the number of coordinates subsets per block for the input and permuted sequences. 
The subset partition rule, (49), implies that f2 < \f[''^\ < + 1, where = [^J. For Case (b), 
each subset has either {^2^) or (2) distinct coordinates pairs and each block has at least such 
pairs. Our IBP interleaver maps ^ sets of coordinates to each block within its span, or equivalently. 
a block "receives" coordinates from $T_s$ neighboring blocks. The optimal IBP rule would map a pair of
coordinates in the same subset to different equivalence classes or blocks and, when this is not possible,
to different supersubsets of the same block.

A pair of coordinates (z, j) in rf'^ can be mapped to any one of the (^) pairs of distinct supersubsets 
rf\r^^\j ^ k oi a. neighboring block. A periodic IBP requires that at least T^^^i^ distinct pairs 
of coordinates from neighboring blocks be permuted to the same block. The pigeonhole principle
implies that Case (b) will occur if



2 J \2 ^ 

For Case (c) the pairs (nibpix) , 7iibp{w)) and {nibp^y) , 7iip{z)) are in two distinct blocks. If the 
two distinct blocks are separated by k blocks {k = 1 means they are two successive blocks), then 
(7ri6p(x), 7iii,p{w)) , (iTibpiy), T^ipiz)) are mapped from Tg — k neighboring blocks in which each block con- 
tains supersubsets and each supersubset has at most (fi + 1)^ and at least coordinates pairs to 
the two designated blocks. Therefore, finite weight codewords result if 



2 



(Ts - k)j^n' > ] (A.2) 



and we obtain the upper bound



W4,min,ibp <4: + 2a- min 

\(Ai,A2) 



Ai 



+ 



Tg-L 
A, 



4/3 (A.3) 



A2 

where (Ai,A2) are subject to the constraints, (CI): ||Ai||a/ = ||A2||Af = 0, (C2): Ai(2) > {^), and 
(C3): ^^^Aifi2 > (^^y. Since = [^J > ^ - 1, we rewrite (A.l) and (A.2) as 



We carry out the minimization with respect to $(A_1, A_2)$ by first finding the two minimums with respect
to the constraints (C1)/(C2) and (C1)/(C3), respectively, and then selecting the smaller of the two.
Using the simplifying assumption [28] that the cardinalities $A_1$ and $A_2$ are the same, and to distinguish
the two candidate minimums, we set $A_1 = A_2 = A_3$ in (A.4) and $A_1 = A_2 = A_4$ in (A.5) so that the
above two inequalities become

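$$A_3^3 - (T_s + 2T_s^2)A_3^2 + 3T_s^2 A_3 L - T_s^2 L^2 < 0 \qquad \text{(A.6)}$$

$$A_4^3 - T_s(T_s - k)A_4^2 + 2T_s(T_s - k)A_4 L - T_s(T_s - k)L^2 < 0 \qquad \text{(A.7)}$$

By defining $X_1 = A_3 - \frac{T_s + 2T_s^2}{3}$ and $X_2 = A_4 - \frac{T_s(T_s - k)}{3}$, we rewrite the above inequalities as

$$X_1^3 + \left(3T_s^2L - \tfrac{1}{3}(T_s + 2T_s^2)^2\right)X_1 + \left(-T_s^2L^2 + (T_s^3 + 2T_s^4)L - \tfrac{2}{27}(T_s + 2T_s^2)^3\right) < 0 \qquad \text{(A.8)}$$

$$X_2^3 + \left(2T_s(T_s - k)L - \tfrac{1}{3}T_s^2(T_s - k)^2\right)X_2 + \left(-T_s(T_s - k)L^2 + \tfrac{2}{3}T_s^2(T_s - k)^2L - \tfrac{2}{27}T_s^3(T_s - k)^3\right) < 0 \qquad \text{(A.9)}$$

Following the standard procedure for solving a cubic equation [30], we define

$$p_1 = 3T_s^2L - \tfrac{1}{3}(T_s + 2T_s^2)^2,$$

$$q_1 = -T_s^2L^2 + (T_s^3 + 2T_s^4)L - \tfrac{2}{27}(T_s + 2T_s^2)^3,$$

$$p_2 = 2T_s(T_s - k)L - \tfrac{1}{3}T_s^2(T_s - k)^2,$$

$$q_2 = -T_s(T_s - k)L^2 + \tfrac{2}{3}T_s^2(T_s - k)^2L - \tfrac{2}{27}T_s^3(T_s - k)^3.$$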
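The change of variable and the cubic-root formula used in this step can be checked numerically. The following is a minimal sketch, assuming the cubic in $A_3$ has the coefficients of (A.6) as read from the source; the parameter values $T_s = 2$, $L = 100$ and the function names (`depress_cubic`, `cardano_real_root`) are illustrative choices, not part of the paper.

```python
import math

def depress_cubic(a, b, c):
    """Map x^3 + a x^2 + b x + c to X^3 + p X + q using x = X - a/3."""
    p = b - a * a / 3.0
    q = c - a * b / 3.0 + 2.0 * a ** 3 / 27.0
    return p, q

def cbrt(v):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def cardano_real_root(p, q):
    """Single real root of X^3 + p X + q when (p/3)^3 + (q/2)^2 > 0."""
    disc = (p / 3.0) ** 3 + (q / 2.0) ** 2
    if disc <= 0:
        raise ValueError("three real roots: this closed form does not apply")
    r = math.sqrt(disc)
    return cbrt(-q / 2.0 + r) + cbrt(-q / 2.0 - r)

# Illustrative parameters only (assumed, not taken from the paper).
Ts, L = 2, 100
a = -(Ts + 2 * Ts ** 2)          # coefficient of A^2 in (A.6)
b = 3 * Ts ** 2 * L              # coefficient of A in (A.6)
c = -(Ts ** 2) * L ** 2          # constant term in (A.6)

p1, q1 = depress_cubic(a, b, c)
X = cardano_real_root(p1, q1)    # root of the depressed cubic
A = X - a / 3.0                  # corresponding root of the original cubic
residual = A ** 3 + a * A ** 2 + b * A + c
print(p1, q1, A, residual)
```

Because $p_1 > 0$ makes the depressed cubic monotonically increasing, the inequality (A.8) holds exactly for $X_1$ below the root returned by `cardano_real_root`.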



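If $L > \frac{10}{3}T_s^3 + T_s^2 - \frac{1}{3}$, then

$$
\begin{aligned}
p_1 &= 3T_s^2L - \tfrac{1}{3}(T_s + 2T_s^2)^2 = T_s^2\left[3L - \tfrac{4}{3}T_s^2 - \tfrac{4}{3}T_s - \tfrac{1}{3}\right] \\
&> T_s^2\left(10T_s^3 + 3T_s^2 - 1 - \tfrac{4}{3}T_s^2 - \tfrac{4}{3}T_s - \tfrac{1}{3}\right) > 0 \qquad \text{(A.10)}
\end{aligned}
$$

$$
\begin{aligned}
p_2 &= 2T_s(T_s - k)L - \tfrac{1}{3}T_s^2(T_s - k)^2 = 2T_s(T_s - k)\left(L - \tfrac{1}{6}T_s(T_s - k)\right) \\
&\geq 2T_s(T_s - k)\left(L - \tfrac{1}{6}T_s(T_s - 1)\right) > 0 \qquad \text{(A.11)}
\end{aligned}
$$

$$
\begin{aligned}
q_1 &= -T_s^2L^2 + (T_s^3 + 2T_s^4)L - \tfrac{2}{27}(T_s + 2T_s^2)^3 < -T_s^2L^2 + (T_s^3 + 2T_s^4)L \\
&= -T_s^2L\left(L - T_s - 2T_s^2\right) < -T_s^2L\left(\tfrac{10}{3}T_s^3 + T_s^2 - \tfrac{1}{3} - T_s - 2T_s^2\right) < 0 \qquad \text{(A.12)}
\end{aligned}
$$

$$
\begin{aligned}
q_2 &= -T_s(T_s - k)L^2 + \tfrac{2}{3}T_s^2(T_s - k)^2L - \tfrac{2}{27}T_s^3(T_s - k)^3 < -T_s(T_s - k)L^2 + \tfrac{2}{3}T_s^2(T_s - k)^2L \\
&< -T_s(T_s - k)L\left(L - \tfrac{2}{3}T_s^2\right) < -T_s(T_s - k)L\left(\tfrac{10}{3}T_s^3 + T_s^2 - \tfrac{1}{3} - \tfrac{2}{3}T_s^2\right) < 0 \qquad \text{(A.13)}
\end{aligned}
$$

$$
\begin{aligned}
p_1 - p_2 &= 3T_s^2L - \tfrac{1}{3}(T_s + 2T_s^2)^2 - 2T_s(T_s - k)L + \tfrac{1}{3}T_s^2(T_s - k)^2 \\
&> 3T_s^2L - \tfrac{1}{3}(T_s + 2T_s^2)^2 - 2T_s^2L + \tfrac{1}{3}T_s^2 = T_s^2L - \tfrac{4}{3}T_s^4 - \tfrac{4}{3}T_s^3 > 0 \qquad \text{(A.14)}
\end{aligned}
$$

$$
\begin{aligned}
q_2 - q_1 &= -T_s(T_s - k)L^2 + \tfrac{2}{3}T_s^2(T_s - k)^2L - \tfrac{2}{27}T_s^3(T_s - k)^3 + T_s^2L^2 - (T_s^3 + 2T_s^4)L + \tfrac{2}{27}(T_s + 2T_s^2)^3 \\
&> LT_s\left(\tfrac{10}{3}T_s^3 + T_s^2 - \tfrac{1}{3} - (T_s^2 + 2T_s^3)\right) > 0. \qquad \text{(A.15)}
\end{aligned}
$$

These results imply $\left(\frac{p_1}{3}\right)^3 + \left(\frac{q_1}{2}\right)^2 > 0$, $\left(\frac{p_2}{3}\right)^3 + \left(\frac{q_2}{2}\right)^2 > 0$ and

$$A_3 < \frac{T_s + 2T_s^2}{3} + G \triangleq C \qquad \text{(A.16)}$$

$$A_4 < \frac{T_s(T_s - k)}{3} + H \triangleq D. \qquad \text{(A.17)}$$

It can be shown that

$$G = \sqrt[3]{-\frac{q_1}{2} + \sqrt{\left(\frac{p_1}{3}\right)^3 + \left(\frac{q_1}{2}\right)^2}} + \sqrt[3]{-\frac{q_1}{2} - \sqrt{\left(\frac{p_1}{3}\right)^3 + \left(\frac{q_1}{2}\right)^2}} \qquad \text{(A.18)}$$

$$H = \sqrt[3]{-\frac{q_2}{2} + \sqrt{\left(\frac{p_2}{3}\right)^3 + \left(\frac{q_2}{2}\right)^2}} + \sqrt[3]{-\frac{q_2}{2} - \sqrt{\left(\frac{p_2}{3}\right)^3 + \left(\frac{q_2}{2}\right)^2}} \qquad \text{(A.19)}$$

are zeros of $f(x) = x^3 + p_1x + q_1$ and $g(x) = x^3 + p_2x + q_2$, respectively. As both $f(x)$ and $g(x)$ are
monotonically increasing functions and $f'(x) = 3x^2 + p_1 > g'(x) = 3x^2 + p_2 > 0$, $\forall x$, we have $f(x) < g(x)$, $\forall x < \bar{x}$,
where $\bar{x}$ is the single intersection point given by

$$\bar{x} = \frac{q_2 - q_1}{p_1 - p_2} > 0.$$

The fact that

$$
\begin{aligned}
f(\bar{x}) &= \left(\frac{q_2 - q_1}{p_1 - p_2}\right)^3 + p_1\frac{q_2 - q_1}{p_1 - p_2} + q_1 \\
&= \left(\frac{q_2 - q_1}{p_1 - p_2}\right)^3 + \frac{p_1q_2 - p_2q_1}{p_1 - p_2} > \left(\frac{q_2 - q_1}{p_1 - p_2}\right)^3 + \frac{p_2q_2 - p_2q_1}{p_1 - p_2} > 0
\end{aligned}
$$

implies that the only real zero of $f(x)$, $G$, is larger than that of $g(x)$, $H$, and thus $C > D$.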

Substituting $A_j = C$ into (A.3), we obtain an upper bound with a very complicated expression. To
have an upper bound with a simpler form, we notice that $\|A_j\|_{M = T_cT_s} = 0$ gives

$$\max A_j = T_sT_c\left\lfloor \frac{C}{T_sT_c} \right\rfloor > C - T_sT_c.$$

Hence a less tight upper bound is given by



WA,min,ibp < 4 + 4a ( niin 



A(3 = A + Aa 



T.Tr 



c 



< 4 + 4a 



C - T^Tr 



1 + 4/5 < 4 + 4a 



C - TsT, 



+ 4/3 



(A.20) 



4/3 



(A.21) 
The upper bound for weight-4 input sequences in Case (a) is twice the upper bound for weight-2 input
sequences shown in (51).
Note that 



E^JC- V^^^ + f I + + (|)^ + f I - J(P + (|)^ - (A.22) 



and 



E - + ' = -gi - - + v^L ) . (A.23) 



In other words, $E$ is a zero of the polynomial



which, like f{x) defined before, is a monotonically increasing function and has only one real zero. For 



Ts>2, 



9(0) = (-^^^ + V^) +Pi[-^^'^+VTsL)+q, 
= -T^L^ + (t! + lI - (T^ + 2T^)L 



3 



1 



< I -T^ { + T^-^y + (tI + 3T/ ) 1 < (A.24) 

3 o 

The last inequality holds because both {^T^ + — and + 3T/ are positive real numbers 
and 

Tf(^T! + T^-^]-iT} + 3T}y = ^tJ + T!-^-9T!-6T^-T! 



3 . . . 3 y V . . . . 3 ^ * 3 

40 

> -T! + 2T!-^-9T!-6T^-T! 

> 12T^ - 6T^ - > 0. 

Hence E is positive and so 
TABLE I
IBP Algorithm

Variables
    L: block length
    N: total number of blocks
    S: interleaver span
    K: block number index
    D(m,k): data at the mth position of the kth block

Recursion
    for K = 0 to N-1
        for i = 0 to S-1
            if (K-i > 0)
                if (K mod (2·(i+1)) < i+1)
                    set m = 2·i+1
                else
                    set m = 2·i+2
                end
                while (m < L)
                    swap D(m,K) and D(m,K-i-1)
                    set m = m+2S+1
                end
            end
        end
    end
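The recursion of Table I can be sketched in Python as follows. This is a non-authoritative reading of the table: the garbled operators are interpreted as `2*(i+1)`, `2*i+1`, `2*i+2` and `m + 2*S + 1`, positions and blocks are indexed from 0, and the function name `ibp_interleave` and the labeled test symbols are hypothetical.

```python
def ibp_interleave(blocks, S):
    """Apply the inter-block swaps of Table I to a list of equal-length blocks.

    blocks[k][m] plays the role of D(m, k); S is the interleaver span.
    A new list of blocks is returned and the input is left untouched.
    """
    data = [list(b) for b in blocks]
    N = len(data)
    L = len(data[0]) if N else 0
    for K in range(N):
        for i in range(S):
            if K - i > 0:
                # Alternate the starting position so that each block
                # pairs with a different neighbor on different positions.
                if K % (2 * (i + 1)) < i + 1:
                    m = 2 * i + 1
                else:
                    m = 2 * i + 2
                while m < L:
                    # Exchange the symbols at position m of blocks K and K-i-1.
                    data[K][m], data[K - i - 1][m] = data[K - i - 1][m], data[K][m]
                    m += 2 * S + 1
    return data

# Usage: label every symbol with its (block, position) of origin.
blocks = [[(k, m) for m in range(11)] for k in range(8)]
out = ibp_interleave(blocks, S=2)
for row in out[:3]:
    print(row)
```

Under this reading, a symbol keeps its position within the block and moves by at most S blocks, and positions that are multiples of 2S+1 are never swapped; any intra-block interleaver would be applied separately.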
Fig. 1. An inter-block permutation interleaving procedure.
Fig. 2. Factor graph representation and information flow of an IBPTC and a classic TC.
Fig. 3. A comparison of exemplary decoding schedules for classic TC and IBPTC when decoding 7 blocks with 2
iterations (four decoding rounds). The numbers in the two rectangular grid-like tables represent the order in which the APP
decoder performs decoding. Hence the first block of the classic TC is decoded by the first 4 decoding rounds (the
leftmost column) but that of the IBPTC is decoded by the first, third, sixth and tenth decoding rounds; see Section III
for detailed discussion.
Fig. 4. Four-processor decoding schedule for an IBPTC where indicates that the APP decoder z is performing the
jth decoding round of its ith decoding phase; each arrowed dashed slant line from upper-right to lower-left represents a
decoding phase for a certain APP decoder.
Fig. 5. (a) An IBPTC decoding module for 1 iteration; (b) An IBPTC pipeline decoder, yl is the received sample
kth interleaved information bit ui.
Fig. 6. Partition of equivalence classes into subsets and IBP interleaving; L = 68, A = 27, T_c = T_s = 3.
Fig. 7. Pre- and post-interleaving nonzero coordinate distributions of weight-4 input sequences that result in low-weight
IBPTC codewords.
Fig. 8. (a) Inter-block interleaving using a non-swap structure (S = 1); (b) inter-block interleaving using the swap
structure. When one starts to interleave (or de-interleave) Block III, Block I has been completely interleaved (or
de-interleaved), its content has been dumped, and the corresponding space is emptied and becomes available for storing new
content again. The storage spaces enclosed by dotted ellipses are thus not needed.
Fig. 9. BER performance of IBPTCs with SRID ≈ 800, block size L = 402 and interleaver span S = 1. For comparison,
the performance of the classic TC with L = 400, 800 is also given.
Fig. 10. BER performance of IBPTCs with SRID ≈ 800, block size L = 265 and interleaver span S = 2. For comparison,
the performance of the classic TC with L = 400, 800 is also given.
Fig. 12. BER performance of IBPTCs and the classic TC with SRID = 1320 and the modified semi-random interleaver.
Fig. 14. BER comparison of IBPTCs and the 3GPP defined turbo code of various block sizes.
Fig. 15. Covariance between a priori information input and extrinsic information output.
Fig. 16. SNR evolution chart behavior of the IBPTC and the classic TC at different E_b/N_0's.
Fig. 17. EXIT chart performance of the IBPTC and the classic TC at different E_b/N_0's.